Tech Talk | 01/08/23 | #WeTalkTech
Unit Testing in ArcGIS with Python: An Introduction
Have you ever been shopping online, booking flights or using your online banking, only to find the next day that the site or app has changed? Some of the questions you might ask are: how did it change, and how does it still work? What went on in the background to ensure it all still works? The answer is testing. In this article I would like to illustrate one such solution, a testing framework using Python, which should prove useful for anyone who is developing ArcGIS applications.
In this post I will explain how you can use testing frameworks to minimise impacts on your systems, maintain business continuity and provide benefits to all ArcGIS end users. For context I have included a practical, real-world scenario (albeit the systems might be fictitious😊) to illustrate the process. This post is not a deep dive into unit testing; it is a general overview from which I hope you can take useful snippets of information. There are many blogs and posts on unit testing with practical examples; however, I find these muddy the waters and leave a lot to be desired, specifically in relation to GIS.
Quiz: There are two ill-defined late 80s/early 90s song and movie references; see if you can spot them.
2. What is unit testing?
Unit Testing Scenario
Eamonn from engineering has a legacy system which relies on manually exporting .csv files on a monthly basis and emailing them to the GIS team. These .csv files have an address field but no x/y location information. A script has been developed to geocode these addresses and add them to a feature layer in ArcGIS Online.
As part of digital transformation in Eamonn’s organisation, this legacy workflow will be upgraded, resulting in changes to Eamonn's system AND to the downstream GIS, in this case ArcGIS Online. The engineering team have bought a system called SkyNet (I'd take this opportunity to warn people against procuring this AI system), and the GIS team have a backlog of changes for managing addresses from SkyNet to GIS. We want to ensure that any changes or future development do not break existing code. The question: how do we do this? The answer is unit testing.
Unit tests are pieces of code written to test the functionality of other pieces of code. The code under test can be a particular function which performs a specific operation, for example constructing a pandas dataframe from an input .csv. Unit testing is an important part of the software development process for any code base, as it helps to check that code works as intended. It has the advantage of catching bugs early in development, for example when the code changes in response to new requirements.
3. Creating unit tests
Unit Test Module
In this example we will be using the unittest module from the Python 3 standard library, and we have a few Python files, including:
- Config.py – Python file with a class for configuration variables
- Locations.py – main code which receives a .csv input, geocodes location and adds to a feature layer
- Locations_test.py – Unit Test Python file with a Test class and methods.
The structure of the unit test script is shown in the example below. We define a test class as a subclass of unittest.TestCase. Individual tests are defined as methods of this class, in this case test_importCSV (Figure 4). This test case is set up to read a .csv file path from the config class and then pass it into the createDataFrame method in the main script (Figure 5). It then obtains a result, and the test checks that this result exists. unittest provides several assert methods to check for and report failures; in this instance we are using assertIsNotNone to check that the resulting dataframe is not None, thereby testing that it was created.
Figure 5 illustrates the method in the locations.py file, within a class in this instance, which receives the .csv path as input, constructs a pandas dataframe, prints it to the console and then returns it. The test above asks the question: does this dataframe exist?
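As the figures are not reproduced here, the following is a minimal sketch of how the method and its test might fit together. Only the names test_importCSV, createDataFrame and assertIsNotNone come from the description above; the Config and Locations class bodies, the csv_path attribute and the sample addresses are assumptions made so that the sketch runs on its own in a single file.

```python
import os
import tempfile
import unittest

import pandas as pd


class Config:
    """Hypothetical stand-in for the configuration class in Config.py."""
    csv_path = None  # assumed attribute; set in setUpClass for this sketch


class Locations:
    """Hypothetical stand-in for the class in locations.py."""

    def createDataFrame(self, csv_path):
        # Read the .csv into a pandas dataframe, print it, then return it.
        df = pd.read_csv(csv_path)
        print(df)
        return df


class TestLocations(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Write a tiny sample .csv (invented data) so no external file is needed.
        cls._tmpdir = tempfile.TemporaryDirectory()
        Config.csv_path = os.path.join(cls._tmpdir.name, "addresses.csv")
        with open(Config.csv_path, "w", newline="") as f:
            f.write("ID,Address\n1,1 Main Street\n2,2 High Street\n")

    @classmethod
    def tearDownClass(cls):
        # Remove the temporary folder once all tests in the class have run.
        cls._tmpdir.cleanup()

    def test_importCSV(self):
        # Does a dataframe come back from createDataFrame?
        result = Locations().createDataFrame(Config.csv_path)
        self.assertIsNotNone(result)
```

In a real project the three classes would live in their own files and the test file would import from the other two. Note that assertIsNotNone only confirms that something was returned; a stricter test might also check the row count or the column names.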
Running the Unit Test
There are multiple ways to run unittest in Python: from your IDE or from the command line. In this example I am using Visual Studio Code (VSCode), which has support for the unittest framework built in. To run the tests, select the flask-shaped Testing icon in the VSCode activity bar and select run. You will see the 4 tests I have configured, with all test names starting with ‘test_’ so that unittest can identify and run them. On selecting run, the tests will run and the output will be displayed in the terminal (console) along with any errors (Figure 7).
To ensure our tests are configured correctly I am going to run the test again; however, this time my configuration passes in an Excel .xlsx instead of a .csv. Imagine Eamonn in engineering went with the default in the new system, which is an .xlsx output. If this was production, our code would fail. Testing the code means we can either implement additional logic in the code or exit gracefully and communicate this to the other system. Figure 9 provides the failing output of the test in this instance, which identifies where the test fails.
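For the command-line route, one detail worth knowing is that unittest's test discovery looks for files matching test*.py by default, so a file named in the style of Locations_test.py needs an explicit pattern. The sketch below demonstrates this with a throwaway test file; the file name sample_locations_test.py and its contents are invented purely for illustration.

```python
import os
import tempfile
import unittest

# Command-line equivalents (run from the folder that holds the test file):
#   python -m unittest Locations_test
#   python -m unittest discover -p "*_test.py"

with tempfile.TemporaryDirectory() as folder:
    # Create a throwaway test file named in the "*_test.py" style.
    with open(os.path.join(folder, "sample_locations_test.py"), "w") as f:
        f.write(
            "import unittest\n"
            "class TestSample(unittest.TestCase):\n"
            "    def test_truth(self):\n"
            "        self.assertTrue(True)\n"
        )

    # The default pattern ("test*.py") does not match this file name.
    default_count = unittest.TestLoader().discover(folder).countTestCases()

    # An explicit pattern picks the file up.
    suite = unittest.TestLoader().discover(folder, pattern="*_test.py")
    matched_count = suite.countTestCases()

    # Run whatever was discovered and report each test by name.
    result = unittest.TextTestRunner(verbosity=2).run(suite)

print(default_count, matched_count)
```

VSCode asks for the same file pattern when unittest is configured for a workspace, so the file naming convention and the pattern need to agree there too.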
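One way to "exit gracefully" is to check the file type before any parsing is attempted and to raise a clear error that a test can assert on. The sketch below assumes a hypothetical helper, validate_csv_path, which is not part of the original locations.py; it shows the idea rather than the implementation used in the article.

```python
import os
import unittest


def validate_csv_path(path):
    """Hypothetical guard: reject any input that is not a .csv file."""
    extension = os.path.splitext(path)[1].lower()
    if extension != ".csv":
        # A clear message makes the failure easy to report to the upstream system.
        raise ValueError(f"Expected a .csv file but received '{extension}': {path}")
    return path


class TestInputFormat(unittest.TestCase):
    def test_rejects_xlsx(self):
        # The default SkyNet export in the scenario is .xlsx, which must be refused.
        with self.assertRaises(ValueError):
            validate_csv_path("addresses.xlsx")

    def test_accepts_csv(self):
        # A .csv path passes through unchanged.
        self.assertEqual(validate_csv_path("addresses.csv"), "addresses.csv")
```

With a guard like this, the calling code can catch the ValueError, log it and notify the other system instead of failing part way through geocoding.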
Changing our Code - Test
I mentioned we are talking about a real-world example, and you will note I have included comments and TODOs in the locations.py file. For this section I am going to make a change to the file. Our code had passed all the tests, and the product owner for SkyNet has now come to our team and asked for some changes to be made. There is a change in field name, so we need to modify our code. To test this, we pass "PremisesID" into the test. Let's run the test and see what happens.
The test case fails for this new name, as our existing code does not include it in the dataframe used to construct the object. Running this test case helps to identify where to change the code quickly and efficiently. We can modify the implementation in locations.py and then re-run the test, which should now pass. Similarly, if we had modified the locations.py code first and run the test, this would also have failed, as the expected result and dataframe are not in the test data.
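A field-name check of this kind can be sketched as follows. Only the name "PremisesID" comes from the scenario; the helper missing_fields, the "Address" field and the old name "PremiseRef" are hypothetical and are used here only to show a passing and a failing case.

```python
import unittest

# Hypothetical list of fields the downstream code relies on.
REQUIRED_FIELDS = ["PremisesID", "Address"]


def missing_fields(columns, required=REQUIRED_FIELDS):
    """Return the required field names that are absent from columns."""
    present = set(columns)
    return [name for name in required if name not in present]


class TestFieldNames(unittest.TestCase):
    def test_new_field_name_present(self):
        # Data using the new field name has nothing missing.
        self.assertEqual(missing_fields(["PremisesID", "Address"]), [])

    def test_old_field_name_reported(self):
        # Data still using the (invented) old name is flagged by field name.
        self.assertEqual(missing_fields(["PremiseRef", "Address"]), ["PremisesID"])
```

The columns argument accepts any iterable of names, so the columns attribute of a pandas dataframe can be passed in directly. Reporting the missing names, rather than a bare pass or fail, points straight at the field that needs to change.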
GIS is becoming more and more intrinsically linked to other systems. The question of spatial, location-based intelligence for business requirements can be answered with GIS. With the reliance on GIS we also need to consider testing and how we test our workflows which integrate with, or are reliant on, our GIS. Implementing unit testing provides confidence that if any requirements for change come in the future, our code can be modified with new features added whilst not breaking the fundamental behaviour of the code and by proxy the wider ‘system’.
This is by no means a deep dive into unit testing frameworks; rather, it provides a high-level overview of what can be achieved. What I hope this example provides are small snippets of code and a structure which can be used to proactively manage change in systems.
The code snippets referenced in this article can be downloaded from GitHub here.
Quiz Answer: SkyNet is from the Terminator series of films, and the locations data frame is from Billy Joel's "We Didn't Start the Fire".
Author: Dr Andrew Bell
Dr Andrew Bell is a Solution Architect at Esri Ireland. Andrew is based in our Holywood Office, and has been with our team since 2018, first starting as a Lead Consultant.
Prior to this, Andrew worked on engagements with systems integrators on a wide range of GIS projects. Andrew has a PhD in Geography which assessed spatial analysis approaches to slope instability using GIS.
Andrew's particular interests are hockey, sailing, and spin. He is also currently seeking to improve his surfing skills.