This is a final monitoring schedule item for Engineers Without Borders, in which I cover a challenge I had during the placement, whether I overcame it, and why or why not.
The most significant challenge was developing the database. This was difficult because I did not have a lot of experience doing such a thing. For example, the spatial databases topic in my Masters course at Unimelb involved an exam and 10 SQL query assignments of 10 items each, but we never built a database as a backend to a GIS from scratch.
Another reason for the challenge was that there was only one staff member with any database training, yet he was one of the busiest at STT, so it was difficult to find time to work with him. I ended up spending more time than I should have with an intern, so hopefully STT is able to employ him as a database administrator in future :-)
The placement lived up to its job description of involving a lot of wrangling with spreadsheets of data produced by STT before finally constructing a relational database from them. The record for preparing a relation goes to the mapping team, who spent a month straightening out the MIT dataset. This is a demographic survey which has been underway since 2009 as part of an infrastructure upgrading project in the most vulnerable of urban poor communities in Phnom Penh. The survey collected information about family size, financial situation, access to public services, access to transport and access to media, such as the number of televisions in the household. All of this data was in separate spreadsheets, one per community.
Most surveys seemed to have similar questions (although one of the latest ones was only about finances), so I asked the team to put all the spreadsheets into one and give each record an ID, so that we could select records by community from the relation. This required them to develop their Excel skills, including using a few IF formulas and even a VLOOKUP occasionally. As usual, a bottleneck in the process was caused by a lack of English skills, because we were working on the English translation of the dataset. Some fields were about the type of material used to build each floor of the house. There were many different words, and spellings, for the same type of material across the spreadsheets. This type of issue occurred for multiple categorical variables in the survey, and it was one of the last things I fixed up before taking the team through the process of importing the dataset into Access.
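For anyone who would rather script this step than do it in Excel, here is a minimal sketch in Python of the same idea: stack the per-community sheets into one relation, give each record an ID, and tidy up the spellings of a categorical field. It is not what we actually did (that was all in Excel), and the field names, the ID scheme and the synonym table are all made up for illustration.

```python
# Hypothetical per-community survey rows; field names and values are invented
# for illustration and are not taken from the actual MIT dataset.
community_sheets = {
    "A": [
        {"household": 1, "family_size": 5, "floor_material": "Concret"},
        {"household": 2, "family_size": 3, "floor_material": "wood "},
    ],
    "B": [
        {"household": 1, "family_size": 4, "floor_material": "concrete"},
        {"household": 2, "family_size": 6, "floor_material": "Timber"},
    ],
}

# Hypothetical lookup from cleaned spelling variants to one canonical label.
MATERIAL_SYNONYMS = {
    "concret": "concrete",
    "concrete": "concrete",
    "wood": "wood",
    "timber": "wood",
}


def normalise_material(raw):
    """Trim, lower-case, and map a material name to its canonical label."""
    key = raw.strip().lower()
    # Unknown spellings are kept (cleaned) so they can be reviewed by hand.
    return MATERIAL_SYNONYMS.get(key, key)


def combine_sheets(sheets):
    """Stack per-community rows into one relation with a unique record ID."""
    combined = []
    for community in sorted(sheets):
        for row in sheets[community]:
            record = dict(row)
            record["community"] = community
            # ID scheme (an assumption): community code plus household number.
            record["record_id"] = f"{community}-{row['household']:03d}"
            record["floor_material"] = normalise_material(row["floor_material"])
            combined.append(record)
    return combined


records = combine_sheets(community_sheets)

# Pulling out one community's records from the combined relation.
community_b = [r for r in records if r["community"] == "B"]
```

Keeping the community code inside the record ID means the ID stays unique after the sheets are merged, even though household numbers restart in each community.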
It was a satisfying day when we finally got the MIT dataset into Access, as we also finished a process of dredging up all the shapefiles of house footprints in the urban poor communities of the MIT survey and merging them all. Where available, these were given the same ID as the database records, so they could be joined in the mapping program. We ended up being able to join over half of the house footprints to records in the database (the rest were not covered in the survey, or were amenities blocks etc.), such that when one used the identify tool on a house one would get the relevant survey data about the families in it. It was really satisfying to be able to demonstrate to the mapping team how a database and map can work together. This dataset was put in our ArcReader file of database layers so that everyone at the organisation could look at it. Now I come to think of it, though, I forgot to put this layer in the model that creates KML files of the layers for display in Google Earth. Perhaps this is something for the research team leader to add to the model, or for the next advisor to show them.
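The join itself is easy to picture outside the mapping program. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Access and the GIS join; the table names, column names and rows are hypothetical. A LEFT JOIN keeps every footprint, so houses with no survey record (or amenities blocks) still show up, just with empty survey fields.

```python
import sqlite3

# In-memory database standing in for the real one; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (record_id TEXT PRIMARY KEY, family_size INTEGER);
    CREATE TABLE footprints (footprint_id INTEGER PRIMARY KEY, record_id TEXT);
""")

# Invented sample rows: two surveyed houses and one footprint with no ID.
conn.executemany("INSERT INTO survey VALUES (?, ?)",
                 [("A-001", 5), ("A-002", 3)])
conn.executemany("INSERT INTO footprints VALUES (?, ?)",
                 [(1, "A-001"), (2, "A-002"), (3, None)])

# LEFT JOIN keeps all footprints; unmatched ones get NULL survey columns.
rows = conn.execute("""
    SELECT f.footprint_id, s.record_id, s.family_size
    FROM footprints AS f
    LEFT JOIN survey AS s ON s.record_id = f.record_id
    ORDER BY f.footprint_id
""").fetchall()

# Footprints that actually picked up survey data.
matched = [row for row in rows if row[1] is not None]
```

Counting `matched` against `rows` gives the share of footprints that joined, which is the "over half" figure in our case.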
I won't explain everything about the database, as from this you get enough of a feel for what went on for the other eight relations. Was the challenge successfully overcome? On the whole, yes. We got the datasets about urban poor issues in Phnom Penh that STT maintains into one database. We got that connected to a mapping engine so we can see the data against a background of reference layers. This map is distributed to all staff members via ArcReader and Google Earth (network links are your friend!). The project, however, is not finished. Only a start was made on reports. This is for the database administrator to work on, and STT will not advance as well as they might without employing one.