by Matt Leetz
I’ve been working in the Critical Food Studies Lab for the past two semesters, or my entire senior year. Over this time, I’ve been building a mathematical spreadsheet model to analyze the presence and intensity of food desert regions in the Chicagoland Metropolitan Statistical Area, which covers Chicago as well as its suburbs extending into Indiana and Wisconsin. I’m interested in taking geographic problem-solving methods and applying them to human geography, in the hope that we can better understand how these problems come to be and aid in the quest toward food security for all Americans. The end product of this project will be an interactive data model overlaid on a map of the area that highlights where food insecurity is most intense and provides qualitative information about the food vendors that do exist.
Last semester was spent collecting data to put into the model. We are using the Maximal Covering Location Problem (MCLP), introduced by Church and ReVelle in 1974, to solve a discrete (accessible by the network of roads) coverage problem: locating food facilities within a maximum allowed distance of customers. We set this as an arbitrary drive time of 20 minutes, which we can increase or decrease as our independent variable. Five matrices are involved in the MCLP. The first is a distance matrix, in which we have calculated drive times between all of the Census blocks in the Chicagoland MSA. This took by far the most time to prepare, as the workflow handed down to me involved manually querying each address, receiving a JSON file, and extracting information from it into the format we needed. Luckily, we were able to use ArcMap’s Network Analyst extension instead, which cut the workload from roughly 400 hours to 20 minutes. Next comes the population matrix. I collected population data from the US Census Bureau in the form of shapefiles, a file type used by geographic information system (GIS) software such as ArcMap and QGIS. I was also able to find population centroid data, the geographically weighted mean location of the population of each Census block, and apply it to the shapefiles.
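To make the coverage idea concrete, here is a minimal sketch of the logic the MCLP captures, on a tiny made-up instance (the drive times, populations, and the two-facility budget are all illustrative, not project data): a block counts as covered if some open facility is within the drive-time limit, and we pick the set of facilities that covers the most people.

```python
from itertools import combinations

# Hypothetical toy data: drive times (minutes) from candidate facility
# sites (rows) to census-block centroids (columns), plus block populations.
drive_time = [
    [10, 25, 40, 15],
    [30, 12, 18, 35],
    [45, 22, 9, 28],
]
population = [500, 800, 300, 650]

MAX_DRIVE = 20   # coverage threshold in minutes (the independent variable)
P = 2            # number of facilities we are allowed to open

def covered_population(sites):
    """Total population within MAX_DRIVE minutes of any open site."""
    total = 0
    for j, pop in enumerate(population):
        if any(drive_time[i][j] <= MAX_DRIVE for i in sites):
            total += pop
    return total

# A toy instance is small enough to check every way of opening P sites;
# the real model needs an integer-programming solver instead.
best = max(combinations(range(len(drive_time)), P), key=covered_population)
print(best, covered_population(best))  # → (0, 1) 2250
```

Brute force works only at this scale; the point is just to show what the objective and the drive-time constraint mean before the full matrix formulation.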
With those two out of the way, the remaining matrices are the Constraint matrix (the maximum allowed drive time), the Assignment matrix (a way of applying the many internal constraints necessary for the model to function), and the Output matrix (a blank matrix growing impatient to be filled). These were easy to create in Excel this semester, and I have since been cleaning and perfecting the model. The model has nearly 2,200 constraints (which I was able to generate with a Python script) and 4.7 million variables, making it extremely taxing on computers. We now have a specialized computer in the Food Institute dedicated to running the model as it was built, but at its current rate it will take 3–5 months to finish. I am now attacking the problem from an alternative route: using MATLAB code to solve it through mixed-integer linear programming, which we can then run on the IU supercomputing cluster through remote access.
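The constraint-generation script mentioned above could look something like the following sketch. It is not the project’s actual code; the function name, the toy neighborhood sets, and the simple LP-style text format are all assumptions for illustration. The idea is that each demand block j gets a coverage constraint (y_j can be 1 only if some site within the drive-time limit is opened), plus one budget constraint limiting how many facilities open.

```python
def mclp_constraints(N, P, n_sites):
    """Emit MCLP constraints as simple LP-style text (illustrative format).

    N maps each demand block j to the list of candidate sites within the
    drive-time limit of that block; P is the facility budget.
    """
    rows = []
    for j, sites in sorted(N.items()):
        # y_j - sum of x_i over nearby sites <= 0: block j counts as
        # covered only if at least one reachable facility is opened.
        lhs = " - ".join([f"y_{j}"] + [f"x_{i}" for i in sites])
        rows.append(f"{lhs} <= 0")
    # Open at most P facilities in total.
    rows.append(" + ".join(f"x_{i}" for i in range(n_sites)) + f" <= {P}")
    return rows

# Toy neighborhood sets: block -> candidate sites within 20 minutes.
N = {0: [0], 1: [0, 1], 2: [1, 2], 3: [0]}
for row in mclp_constraints(N, P=2, n_sites=3):
    print(row)
```

Generating constraints this way scales the same script from a toy instance to the model’s ~2,200 constraints; only the neighborhood sets (derived from the drive-time matrix) change.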