Thursday, 13 September 2012

Automating Crop Surveillance – theory vs. praxis - part 3/3

Field Practice

With all these good ideas we set out to implement a national crop survey with NaCRRI this year. The purpose is to test the mobile survey system, which automatically updates an online map with survey data from the field in real time. As is always the case, the theory underestimates the praxis, and this case was no different. Several issues we thought were significant and would be problematic turned out to be non-issues; that was the good part. The terrifying part was the non-issues that became issues. I detail several of these here, spanning the preparation, training and deployment phases of the apps on the phone, along with what was observed on two separate field test visits with the actual survey experts.

  1. Space/storage considerations on the phone – the image capture application uses the native phone camera application. We had to change the camera's default image setting from 3M (high resolution) to 1M (normal resolution). Image size determines both the bandwidth used to upload the images to the server and the memory card capacity each phone needs. For the survey we expect each phone to take a minimum of 1000 images, so the 8 phones we are using will produce at least 8000 images; we estimate approximately 10000 in total.
  2. Screen/keyboard size of phone – during the training of the surveyors, the use of the phone keyboard came up as an issue. We are using the Huawei Gaga U8180, which has a display size of 240 x 320 px, so it's not all that big but still usable with some training. The question of how to use the touch screen with dirty fingers (from digging up roots, etc.) also came up.
  3. Sun glare – this turned out to be one of the most daunting issues. Collecting data in the gardens in full sun was problematic because the phone screen was hard to see, which made the form hard to navigate. One workaround was to set the phone's brightness setting to auto. This gives some improvement, but glare is something to take into consideration when buying more devices in the future.
  4. Background clutter – for one of the field visits we tried placing a sheet of cardboard behind each leaf before photographing it. This was very cumbersome, especially for the whitefly shot: whiteflies reside on the underside of the leaf, so the leaf has to be turned over, and taking the shot with the background in place required at least two people. We rejected this approach, so images have to be taken with a cluttered background and the filtering done at the server end.
  5. Image capture – to score the severity of disease accurately, some diseases require examining the whole plant, while for others it is enough to look at the canopy. A single 2D image unfortunately does not provide all the information needed to determine the severity score of a plant, so we settled on taking two representative images of each plant: one full plant image and one close-up canopy/leaf shot.
  6. Power/charging issues – the Huawei Gaga, like most touch screen phones, has a battery life of about one day under heavy usage. We had to provide car chargers so the surveyors could charge the phones in the cars as they moved from one garden to the next.
  7. Server issues – Google App Engine supports only a limited amount of data storage and access for free. We had to enable billing to support this survey of an estimated 10000 geo-coded images.
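
To make the storage and bandwidth trade-off in items 1 and 7 concrete, here is a rough back-of-envelope sketch. The phone and image counts come from the text above; the average file sizes per camera setting are purely illustrative assumptions, not measurements from the survey phones.

```python
# Back-of-envelope estimate of image storage and upload volume.
# ASSUMPTION: the average file sizes below are illustrative guesses,
# not values measured on the survey phones.
PHONES = 8                       # phones used in the survey (from the text)
MIN_IMAGES_PER_PHONE = 1000      # minimum images per phone (from the text)
ESTIMATED_TOTAL_IMAGES = 10000   # overall estimate (from the text)

# Hypothetical average JPEG size in megabytes for each camera setting.
ASSUMED_MB_PER_IMAGE = {"3M": 1.0, "1M": 0.4}


def total_gigabytes(n_images, mb_per_image):
    """Total data volume in gigabytes for n_images of a given average size."""
    return n_images * mb_per_image / 1024


min_images = PHONES * MIN_IMAGES_PER_PHONE
print(f"Minimum images across all phones: {min_images}")
for setting, mb in ASSUMED_MB_PER_IMAGE.items():
    per_phone = total_gigabytes(MIN_IMAGES_PER_PHONE, mb)
    overall = total_gigabytes(ESTIMATED_TOTAL_IMAGES, mb)
    print(f"{setting}: ~{per_phone:.2f} GB per phone, ~{overall:.2f} GB to upload")
```

Whatever the real file sizes are, the same arithmetic applies: the per-phone figure drives the memory card choice, and the overall figure drives the upload bandwidth and server storage.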

Current Status

We are waiting on NaCRRI to sort out the necessary logistical issues related to the survey. It is expected the survey will be conducted in September 2012.

Automating Crop Surveillance – theory vs. praxis - part 2/3

The Science behind

To automate some of the expert processes we leverage some of the latest computer vision and machine learning techniques and apply them to three specific tasks in this survey.

1. Automated diagnosis

The objective here is to assess the feasibility of automated, computer vision-based diagnosis of CMD. Feature extraction techniques based on color and shape are used to extract features from images of healthy and CMD-infected cassava plants. A set of standard classification methods (naive Bayes, two-layer MLP networks, support vector machines, k-nearest neighbor and divergence-based learning vector quantization) is then applied to determine the diagnosis of the plant in each image. This diagnosis is done either at the server end or on the phone.
Depiction of survey process - (a) Image capture, (b) Automated diagnosis, (c) Real-time mapping
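
As a minimal sketch of the color-feature-plus-classifier idea, the example below builds a coarse RGB histogram and classifies it with k-nearest neighbor, one of the methods listed above. It is not our actual pipeline: the function names, the bin count and the toy "images" (short lists of RGB pixels) are all assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram; pixels is a list of (r, g, b) in 0..255."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [h / total for h in hist]


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors nearest to query.

    train is a list of (feature_vector, label) pairs; distance is squared
    Euclidean distance between feature vectors.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): healthy leaves are mostly green, while
# CMD-infected leaves show a large share of yellowish pixels.
GREEN, YELLOW = (40, 160, 40), (200, 200, 60)
train = [
    (color_histogram([GREEN] * 10), "healthy"),
    (color_histogram([GREEN] * 9 + [YELLOW]), "healthy"),
    (color_histogram([GREEN] * 5 + [YELLOW] * 5), "cmd"),
    (color_histogram([GREEN] * 4 + [YELLOW] * 6), "cmd"),
]

query = color_histogram([GREEN] * 3 + [YELLOW] * 7)
print(knn_predict(train, query))
```

Real images would need shape features as well, and the cluttered backgrounds mentioned in the field notes would have to be filtered out before the histogram is computed.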

2. Automated whitefly count

Whiteflies are the major vector responsible for transmission of cassava mosaic disease and other cassava-related diseases. To understand the spread of disease, experts are required to count the number of whiteflies on a leaf. This is not a trivial task, however: the whiteflies are small and move around readily, which leads to inaccurate counts. We implemented a computer vision-based system on the mobile phone that automatically counts the whiteflies on a leaf, either from a photograph of the leaf or from video of the leaf captured in real time.

Whitefly detection and counting
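
One simple way to picture the counting step is thresholding followed by connected-component counting: whiteflies show up as small bright blobs against the darker leaf. The sketch below illustrates that idea on a tiny grayscale grid. It is an assumption-laden toy, not the detector running on our phones; the threshold, the minimum blob size and the sample grid are all made up.

```python
def count_blobs(gray, threshold=200, min_pixels=2):
    """Count connected bright regions (4-connectivity) in a 2D grayscale list.

    Regions smaller than min_pixels are treated as noise and ignored.
    """
    rows, cols = len(gray), len(gray[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or gray[r][c] < threshold:
                continue
            # Flood fill this bright region and measure its size.
            stack, size = [(r, c)], 0
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and gray[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if size >= min_pixels:
                count += 1
    return count


# Toy "leaf" (hypothetical): dark background with two bright blobs
# and one isolated bright pixel that counts as noise.
leaf = [
    [50,  50,  50,  50, 50,  50],
    [50,  255, 255, 50, 50,  50],
    [50,  255, 50,  50, 255, 50],
    [50,  50,  50,  50, 255, 50],
    [255, 50,  50,  50, 50,  50],
]
print(count_blobs(leaf))
```

On real photographs, glare and overlapping insects make a fixed threshold unreliable, which is part of why counting is harder than this toy suggests.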

3. Determination of level of necrosis in diseased roots

For some diseases, such as Cassava Brown Streak Disease (CBSD), the major part affected is the tuber/root of the plant. Experts out in the field ordinarily dig up a set of plants in selected gardens and examine five cross-section cuttings of the root. A disease severity score is assigned to the plant based on the average percentage of necrotized root across all five cross-sections examined. We intend to automate this process so that necrosis is scored uniformly by a single system.

Necrotized roots infected with CBSD
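
The scoring rule described above (average percentage of necrotized root over five cross-sections) can be sketched as follows. The sketch assumes each cross-section has already been segmented into a labeled mask, which is the hard part and is not shown; the label values and function names are hypothetical.

```python
# ASSUMPTION: each cross-section is a 2D mask with hypothetical labels
# 0 = background, 1 = healthy root tissue, 2 = necrotic root tissue.
BACKGROUND, HEALTHY, NECROTIC = 0, 1, 2


def necrosis_percentage(mask):
    """Percentage of root pixels in one cross-section that are necrotic."""
    root = sum(1 for row in mask for v in row if v in (HEALTHY, NECROTIC))
    necrotic = sum(1 for row in mask for v in row if v == NECROTIC)
    return 100.0 * necrotic / root if root else 0.0


def plant_necrosis_score(masks):
    """Average necrosis percentage over all cross-sections of one plant."""
    return sum(necrosis_percentage(m) for m in masks) / len(masks)


# Five toy cross-sections with 0%, 25%, 50%, 25% and 0% necrosis.
sections = [
    [[1, 1], [1, 1]],
    [[2, 1], [1, 1]],
    [[2, 2], [1, 1]],
    [[1, 2], [1, 1]],
    [[1, 1], [1, 1]],
]
print(plant_necrosis_score(sections))
```

How the averaged percentage maps onto the experts' severity scale is not covered here.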

Automating Crop Surveillance – theory vs. praxis - part 1/3


Every year the National Crop Resources Research Institute (NaCRRI) carries out a crop survey in Uganda targeting the cassava plant. Normally experts are required to go out to different gardens in the four disparate regions of Uganda, visually examine cassava plants, and fill out paper forms indicating the incidence and severity of the diseases affecting them.

Automating this process is what we set out to do. The main reason is to enable policy makers to access this data immediately as it is collected in the field. The second is to make several component processes of the survey more efficient, for example automated diagnosis and severity scoring, automated counting of disease vectors on the cassava leaves, and uniform scoring of necrotized cassava tubers.

Theoretical Solution

In brief…

Replace the paper forms with cheap mobile phones running Android OS and, using the existing telecommunications network, get geo-tagged surveys from data collectors in the field directly onto a map in real time.

Because of the available processing power on the phones, automate the cumbersome tasks of whitefly count and severity scoring on the phone so experts need not be the ones to carry out the survey and can use their limited time doing something else.


We built a system based on the Open Data Kit (ODK), Google AppEngine, the Google Maps API and Google Fusion Tables – heck, we even communicated using Gmail ☺

ODK: We used ODK-build to build forms usable on the mobile devices (converting the paper form to an appropriate mobile format). ODK-Collect was used to collect the data on the mobile devices. ODK is presently only compatible with devices running Android OS.
Google AppEngine (GAE): We used GAE to host the server side of the system, which receives the uploaded data from the mobile phones. It was implemented as an instance of ODK-Aggregate.
Google Maps API: We integrated the API with GAE to display the uploaded data on a dynamic map.
Fusion Tables: Used to seamlessly interface data in ODK Aggregate with the Maps API.
A real-time map of the crop survey

Thursday, 6 September 2012

Second in video seminar series

The second in our series of live video seminars took place today, with Allan Tucker from Brunel University giving a talk on probabilistic analysis methods for some genetic, clinical and ecological datasets.

Last month, Charles Fox from the University of Sheffield presented some work on traffic flow modelling using UK number plate recognition data. The series gives us a way to hear presentations from AI and machine learning researchers around the world without the obstacle of them having to physically travel to Uganda.

The list of upcoming seminars is here.