Abstract

The neotropical savanna is the second largest biome in South America, with significant potential for agricultural development. In Colombia, this biome is experiencing rapid land-use change leading to the conversion of seminatural landscapes into intensive agricultural systems. Our Dataset Paper documents the emerging intensive grain production systems. Between 2011 and 2013, we established 336 observatory plots within farmers’ maize, rice, and soybean fields along a 200 km transect from Puerto Lopez (Meta) to Viento (Vichada). From each of these plots, we submit 184 descriptors or variables capturing their location, rotation history, management, and environment. Our specific objective in collecting the data was to identify key factors explaining yield variation, with emphasis on interactions between management and environmental factors potentially informing the development of site-specific management protocols. Beyond this objective, the dataset submitted here is intended to support additional inquiries contributing to the sustainable development of agriculture in the neotropical savannas.

1. Introduction

The neotropical savanna is the second largest biome in South America, occupying about 250 million hectares of land [1–3]. Its soils are notorious for their high acidity and aluminum levels toxic to most crops. Still, Nobel Prize Laureate Norman Borlaug called it “the last agricultural frontier in the world” [4]. Indeed, between 1955 and 2005 the Brazilian savannas experienced an extraordinary frontier expansion leading to the cultivation of over 40 million hectares of land previously considered infertile [5]. Reflecting on this achievement, Borlaug envisioned a similar transformation for the savannas in Colombia, Venezuela, and Southern Africa [5]. Contributing to his vision, our Dataset Paper documents the ongoing land-use conversion of seminatural savannas into intensive grain production systems in Colombia.

Our objective in collecting the data was to help identify key factors explaining yield variation in maize, rice, and soybeans on farmers’ fields. From an operational standpoint, we recognized that these factors were of two types: those that easily lend themselves to agronomic manipulation and those that do not. The first group includes variables like soil pH, which can be adjusted relatively easily by liming. We call this group management factors. The second group includes variables like soil texture, which cannot be changed. We call this group environmental (or zoning) factors. We further recognized that the influence of some management factors on yield may depend on one or more environmental factors. For example, the same amount of irrigation water may improve yield in sandy soils but cause root rot in clay soils. Characterizing these types of interactions between management and environmental factors is the foundation of site-specific agriculture. Aware of this potential, our study was designed to provide a stepping stone for the development of site-specific grain agriculture in the Colombian savannas.

Encouraged by multiple requests for our dataset to address additional research questions, we are pleased to formally present it to our community of interest in this Dataset Paper. The objective of submitting this dataset, therefore, is to encourage and support diverse research inquiries contributing to the sustainable development of agriculture in the neotropical savannas.

2. Methodology

The study was conducted in the Colombian savannas, locally known as “Llanos Orientales,” a region that extends from the Meta Department to the Venezuelan border (Figure 1(c)). Its climate is characterized by a wet season that begins in March and a dry season that begins in December, with an average annual temperature around 26°C [6, 7]. The length of the wet season accommodates two planting seasons for grain crops, one around April and another around September. Soils are mainly Oxisols with low fertility and high acidity and Al saturation [6, 8]. A good ecological characterization of the region is provided by Blydenstein [9].

Our method involved the establishment of observatory plots within farmers’ maize, rice, and soybean fields along a 200 km transect from Puerto Lopez to Viento from 2011 to 2013. We call these plots “EGM,” after the Spanish acronym for Georeferenced Sampling Station (Figures 2(a) and 2(b)). EGMs were 20 m², a size we chose because it facilitated intensive sampling and matched the experimental field size used by the Colombian Ministry of Agriculture to evaluate and register new cultivars. A series of farm visits, involving unstructured interviews with farm managers and guided field inspections, helped us survey the variability between and within fields with respect to topography, soil texture, rotation history, and yield history. During the inspections, we established two or more EGMs per field in consultation with the farm managers, placing them so as to capture the greatest perceived variability with respect to the above-mentioned factors. Our sampling within fields was therefore not random but was designed to increase statistical variance with respect to yield and a few of its potentially important environmental determinants.

We relied on three sources of data: farm records, direct measurements, and geographic information systems (GIS) databases. Farm records helped us capture rotation history, crop cultivar, and planting dates. Direct measurements helped us capture soil parameters, plant density, and yield. Immediately before the planting season, we collected three soil subsamples (Figure 2(c)) along a diagonal transect across the EGM, at depths of 0–10 cm and 10–20 cm, and bulked them into a single sample per depth profile. These samples were submitted for chemical analyses to the soil laboratory at the International Center for Tropical Agriculture (CIAT). In addition, core samples of 100 cm³ volume (Figure 2(d)) were taken from near the center of the EGM, at depths of 0–10 cm and 10–20 cm, and submitted for physical analyses to the Soil Laboratory at the Colombian Corporation of Agricultural Research (Corpoica). Plant density, yield, and grain moisture were measured within two weeks of the field’s intended harvest date. We harvested the EGM manually to measure yield and grain moisture content (Figure 2(b)). We used these two values to adjust yield to the moisture content desired for storage (i.e., dry yield), which is 14.2% for rice and maize and 12.2% for soybeans. Finally, EGMs were georeferenced using global positioning system (GPS) receivers (GPSMap 76CSx; Garmin, Olathe, Kansas, USA), and the coordinates were used to retrieve 250 m normalized difference vegetation index (NDVI) data from the Moderate Resolution Imaging Spectroradiometer (MODIS, [10]), precipitation data from the Tropical Rainfall Measuring Mission (TRMM; [11]), and interpolated climate data from WorldClim [12].
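The moisture adjustment of yield can be sketched as follows. The text does not give the exact formula, so this sketch assumes the common dry-matter correction, in which the harvested grain weight is rescaled from the measured moisture content to the storage moisture content. The function name, argument names, and example values are illustrative and are not taken from the dataset.

```python
# Storage moisture targets stated in the text (percent; wet basis is assumed).
STORAGE_MOISTURE = {"rice": 14.2, "maize": 14.2, "soybean": 12.2}


def standardized_yield(fresh_yield, measured_moisture, crop):
    """Rescale a fresh-weight yield to the storage moisture content.

    Assumed formula (not stated in the paper):
        adjusted = fresh * (100 - measured) / (100 - target)
    The result has the same units as fresh_yield.
    """
    if not 0 <= measured_moisture < 100:
        raise ValueError("measured_moisture must be a percentage in [0, 100)")
    target = STORAGE_MOISTURE[crop]
    return fresh_yield * (100.0 - measured_moisture) / (100.0 - target)


# Hypothetical example: 5000 kg/ha of maize harvested at 20% grain moisture.
example = standardized_yield(5000.0, 20.0, "maize")
print(round(example, 1))
```

Under this assumption, grain harvested wetter than the storage target is adjusted downward, and grain harvested exactly at the target is left unchanged.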

3. Dataset Description

The dataset associated with this Dataset Paper consists of 2 items which are described as follows.

Dataset Item 1 (Table). Data for the 336 observatory plots (EGMs) within farmers’ maize, rice, and soybean fields, with 184 descriptors or variables capturing their location, rotation history, management, and environment in the Colombian savannas (Llanos Orientales). Each row corresponds to an EGM, and each column corresponds to a descriptor or variable. Broadly, there are five categorical descriptors for location at different scales (storage type: character) and two variables for geographic coordinates (storage type: float), one for plot area (storage type: float), four for rotation history (storage type: character), two for the crop and cultivar sown (storage type: character), five capturing the temporal dimension of the production event (storage types: integer, character, and date), three capturing plant density (storage types: float and integer), one for grain moisture (storage type: float), two for yield (storage type: float), 63 for soil physical and chemical properties at two soil depth profiles (storage type: float), 29 for precipitation data retrieved from TRMM (storage type: float), and 67 for temperature data retrieved from WorldClim (storage type: float). Missing values are represented by blank cells. In the table, the column Grain Yield Standardized presents the grain yield standardized to the percentage of moisture content desired for storage. The column Mean Diurnal Range was calculated as the mean of monthly (max temp − min temp), the column Isothermality as (BIO2/BIO7) × 100, the column Temperature Seasonality as standard deviation × 100, and the column Temperature Annual Range as BIO5 − BIO6. The column Rainfall Seasonality was measured as the coefficient of variation. For more details, see Table 1.

  • Column 1: Plot Identifier
  • Column 2: Field Identifier
  • Column 3: Farm Identifier
  •     ⋮
  • Column 182: Rainfall of Driest Quarter (mm)
  • Column 183: Rainfall of Warmest Quarter (mm)
  • Column 184: Rainfall of Coldest Quarter (mm)
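The derived climate columns named in the description of Dataset Item 1 can be illustrated with a minimal sketch that works from twelve monthly values. It follows the stated definitions and the usual WorldClim conventions, where BIO2 is the mean diurnal range, BIO5 the maximum temperature of the warmest month, BIO6 the minimum temperature of the coldest month, and BIO7 their difference. Several details are assumptions rather than facts from the paper: the monthly mean temperature is taken as the midpoint of the monthly maximum and minimum, the population standard deviation is used, and the coefficient of variation is expressed as a percentage. The function name and the monthly values are hypothetical.

```python
from statistics import mean, pstdev


def derived_climate_columns(tmax, tmin, rain):
    """Compute the derived columns from 12 monthly values each.

    tmax, tmin: monthly maximum and minimum temperature (same units).
    rain: monthly rainfall (mm).
    Assumes a nonzero annual temperature range and nonzero mean rainfall.
    """
    if not (len(tmax) == len(tmin) == len(rain) == 12):
        raise ValueError("expected 12 monthly values per variable")
    # Assumption: monthly mean temperature is the midpoint of max and min.
    tmean = [(hi + lo) / 2 for hi, lo in zip(tmax, tmin)]
    diurnal_range = mean(hi - lo for hi, lo in zip(tmax, tmin))  # BIO2
    annual_range = max(tmax) - min(tmin)  # BIO5 - BIO6, i.e. BIO7
    return {
        "Mean Diurnal Range": diurnal_range,
        "Isothermality": diurnal_range / annual_range * 100,
        "Temperature Seasonality": pstdev(tmean) * 100,
        "Temperature Annual Range": annual_range,
        # Assumption: coefficient of variation reported as a percentage.
        "Rainfall Seasonality": pstdev(rain) / mean(rain) * 100,
    }


# Hypothetical monthly values, for illustration only.
tmax = [32, 33, 32, 31, 30, 29, 29, 30, 31, 31, 31, 31]
tmin = [21, 22, 22, 22, 22, 21, 21, 21, 21, 22, 22, 21]
rain = [20, 50, 120, 280, 380, 400, 350, 300, 280, 250, 150, 50]
columns = derived_climate_columns(tmax, tmin, rain)
for name, value in columns.items():
    print(f"{name}: {value:.2f}")
```

Values computed this way need not match the dataset columns exactly, since the dataset values were retrieved from interpolated WorldClim layers rather than recomputed from monthly series.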

Dataset Item 2 (Table). It consists of time series NDVI data for the 202 EGMs (identified by the column Plot ID L1) for which this reading could be retrieved.

  • Column 1: Plot ID L1
  • Column 2: NDVI Date
  • Column 3: NDVI
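Because Dataset Item 2 is stored in long format, with one row per plot and NDVI date, a typical first step is to group the rows by Plot ID L1. The sketch below shows one way to extract the peak NDVI reading per plot using only the Python standard library. The plot identifiers, dates, and NDVI values are hypothetical, and the handling of missing readings as None is an assumption about how blank cells would be loaded from the spreadsheet.

```python
from collections import defaultdict
from datetime import date


def peak_ndvi_by_plot(rows):
    """Return {plot_id: (date_of_peak, peak_ndvi)}.

    rows: iterable of (plot_id, ndvi_date, ndvi) tuples, mirroring the
    three columns of Dataset Item 2. Rows whose NDVI is None are skipped.
    """
    series = defaultdict(list)
    for plot_id, ndvi_date, ndvi in rows:
        if ndvi is not None:
            series[plot_id].append((ndvi_date, ndvi))
    # For each plot, keep the observation with the highest NDVI value.
    return {pid: max(obs, key=lambda o: o[1]) for pid, obs in series.items()}


# Hypothetical rows, for illustration only (not values from the dataset).
rows = [
    ("EGM-A", date(2012, 4, 22), 0.41),
    ("EGM-A", date(2012, 5, 24), 0.66),
    ("EGM-A", date(2012, 6, 9), 0.78),
    ("EGM-A", date(2012, 7, 11), None),
    ("EGM-B", date(2012, 4, 22), 0.35),
    ("EGM-B", date(2012, 5, 24), 0.59),
]
peaks = peak_ndvi_by_plot(rows)
for plot_id, (when, value) in sorted(peaks.items()):
    print(plot_id, when.isoformat(), value)
```

The resulting per-plot summary can then be matched to the rows of Dataset Item 1 through the shared plot identifier.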

4. Concluding Remarks

This comprehensive Dataset Paper is submitted to support research leading to the sustainable agricultural development of the neotropical savannas. Its specific design, however, responds to our interest in identifying management-by-environment interactions characterizing the potential for site-specific grain agriculture in the region. Our approach is informed by the rapidly growing literature demonstrating the promise of ecoinformatics approaches to streamline agricultural research [13–18]. Based on these experiences, we believe our Dataset Paper holds significant potential to facilitate a quantum leap in agricultural research for the development of the Colombian savannas.

Dataset Availability

The dataset associated with this Dataset Paper is dedicated to the public domain using the CC0 waiver and is available at http://dx.doi.org/10.1155/2015/625846/dataset.

Conflict of Interests

There is no conflict of interests in the access or publication of this Dataset Paper.

Authors’ Contribution

Soroush Parsa and Jaime Gómez Naranjo contributed equally to the study.

Acknowledgments

The authors’ most sincere gratitude goes to the farmers and farm managers who partnered with them in this study and to Santiago González Venzano from Solapa4 for generously sharing his experience and strategic vision for the project. They also thank Ximena Moreno and Nidia Zuleta for their administrative support; Mariano Tamburrino for kindly retrieving the NDVI data; Mariela Rivera and Harvey Parada for their invaluable recommendations; and Elcio Guimaraes for his strategic leadership. This project was generously funded by the Colombian Ministry of Agriculture and Rural Development.

Dataset Files

  • 625846.item.1.xlsx: Dataset Item 1 (Table), described in Section 3.
  • 625846.item.2.xlsx: Dataset Item 2 (Table), described in Section 3.