ESRI Geocoder Plugin

This plugin is for GeoKettle, the geospatial version of Pentaho Data Integration (PDI). You can get GeoKettle from www.geokettle.org. As of this writing, it is based on the latest version of PDI, v3.2.

This plugin will allow you to geocode addresses using ESRI's software. You can geocode using two different interfaces:

  1. SOAP interface
  2. ESRI Desktop Arc Editor license

The SOAP interface does not require you to have ESRI software installed on your desktop; it calls a web service to do the geocoding. The desktop interface requires ESRI software to be installed on the computer from which you are running the geocoder, and you must have arcobjects.jar (from version 9.3 of the ESRI software).

The reason there are two choices has to do with the way ESRI licenses their commercial geocoding. If you purchase geocodes for use with the desktop software, you cannot register to also use the SOAP (or REST, or any other web service) interface. If you purchase the web service options, you cannot geocode using the desktop software.

Also, please note that this plugin will allow you to use the free versions of ESRI's online geocoding services. These services are subject to certain licensing restrictions, so you should read the terms and conditions for using these services on ESRI's website before you use them! I am in no way responsible for your use of this plugin if you break ESRI's licensing agreements.

I will warn you that this component is somewhat slow, for several reasons. One is that ESRI's version 9.3 desktop Java interface does not currently support batch mode geoprocessing, so each address must be sent to the server one at a time. The free SOAP interface does not support batches of more than 10 addresses, and I do not have access to the commercial SOAP interface to test larger batches, so 10 addresses per web service call is the limit. In my testing, each interface generally gets a throughput of about 9 records per second. By comparison, geocoding directly in the ArcGIS desktop software runs at more like 25 records per second. Perhaps future ESRI versions will add support for batching.
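To make the 10-address limit and the throughput figure concrete, here is a minimal sketch in Python. It is not the plugin's own code: the function names `batch_addresses` and `estimated_seconds` are hypothetical, the sample addresses are made up, and the 9 records per second default is simply the figure from my testing quoted above.

```python
from typing import Iterator, List, Sequence

# Per-call limit stated above for the free SOAP interface.
MAX_BATCH_SIZE = 10


def batch_addresses(addresses: Sequence[str],
                    size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` addresses (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(addresses), size):
        yield list(addresses[start:start + size])


def estimated_seconds(record_count: int,
                      records_per_second: float = 9.0) -> float:
    """Rough run time, assuming the ~9 records/second seen in testing."""
    return record_count / records_per_second


if __name__ == "__main__":
    sample = [f"{n} Main St" for n in range(1, 26)]  # 25 made-up addresses
    for batch in batch_addresses(sample):
        # One web service call would be made per batch.
        print(len(batch), "addresses, starting with", batch[0])
    print(round(estimated_seconds(100_000) / 3600, 1), "hours for 100,000 records")
```

With 25 addresses this produces batches of 10, 10, and 5. At about 9 records per second, 100,000 records works out to roughly three hours, which is worth keeping in mind when planning a large run.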

Another item to note is that there is a memory leak somewhere when using the desktop interface. I am fairly sure it is not within the component: I used the Heap Analysis Tool to analyze a running GeoKettle process, and the evidence points to ArcObjects as the culprit, which I can do nothing about. ArcObjects makes a lot of native calls, and I think some memory is not being released. So be advised that a process may crash when geocoding lots of records (> 100,000) using the desktop interface.

Download

ESRIMultiGeocoderPlugin.zip

Installation

Close GeoKettle if it is open.

Download the zip file and unpack it into the ${geokettlehome}/plugins/steps folder. You can name the plugin folder whatever you want, but I recommend naming it ESRIMultiGeocoderPlugin. If you plan to use the Desktop interface, you must also copy arcobjects.jar from your ESRI installation into the plugin directory. The jar should be located in Program Files/ArcGIS/java/lib/. Be sure to copy, not move!

Reopen GeoKettle. You will see the ESRI Multi Geocoder Plugin in the Geospatial folder of available transformation steps.
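If you would rather script the unpack-and-copy steps above, something like the following Python sketch should work. It is only an illustration: the function name `install_plugin` is hypothetical, the paths you pass in are your own, and it assumes the zip holds the plugin files at its top level rather than inside a folder of its own.

```python
import shutil
import zipfile
from pathlib import Path


def install_plugin(zip_path, geokettle_home, arcobjects_jar=None,
                   folder_name="ESRIMultiGeocoderPlugin"):
    """Unpack the plugin zip into plugins/steps and optionally copy arcobjects.jar.

    Hypothetical helper; it assumes the zip has no top-level folder of its own.
    """
    plugin_dir = Path(geokettle_home) / "plugins" / "steps" / folder_name
    plugin_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(plugin_dir)

    if arcobjects_jar is not None:
        # Copy, not move: the original jar stays in the ESRI installation.
        shutil.copy2(arcobjects_jar, plugin_dir / "arcobjects.jar")

    return plugin_dir
```

Pass `arcobjects_jar` only if you plan to use the Desktop interface. Run it while GeoKettle is closed, then reopen GeoKettle as described above.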

Documentation

Plugin Configuration

Result Fields

Depending on the interface (SOAP or Desktop) and the address locator you choose, some of these fields may not be available. See ESRI's online documentation for the locator you are using for more information.