Afterthoughts - GC v0.32 (Bus Problem)

So we had a little GC, with the weight on the word little. Five people showed up at the pre-event, while only three or four of us made it to the actual event, depending on how you count.

We got a nice little introduction to the topic from Tuukka and made some progress on the converter. A couple of notes:

  • Parsing 600 MB XML isn't as easy as you might think
  • grep or similar works great for getting quick insights
  • Apparently a combination of grep and sed works to some extent, provided your XML is formatted right
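
For the record, here is a minimal sketch of how a file that size could be read as a stream instead of being loaded in one go. It uses Python's standard xml.etree.ElementTree.iterparse. This is not what we ran at the event, and the element names in the sample are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_tags(source):
    """Count element tags in an XML stream without building the full tree."""
    counts = {}
    # iterparse hands over each element once its closing tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        # Release the element's children, text and attributes after counting.
        elem.clear()
    return counts


# Tiny stand-in document; these element names are hypothetical.
sample = io.BytesIO(
    b"<network><station id='1'/><station id='2'/><route id='9'/></network>"
)
print(count_tags(sample))
```

A tag count like this gives roughly the same quick overview as grep, but it does not depend on how the XML happens to be formatted.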

Besides the data, we also had access to the XML schema, documentation, etc. You can find links to the relevant data in our previous post. The converter repository itself contains some interesting bits.

We tried a sort of test-driven approach first. The problem was that there weren't any concrete examples anywhere. It was all described on a higher level. Sure, we had that 600 MB XML, but try extracting examples out of that... That is where the filtering approaches came in. We knew the station data had an id related to Jyväskylä. Using that id it was possible to reach other data that refers to it, as the data is relational after all.
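
The filtering idea could be sketched roughly like this in Python: pick out one station by id and collect the records that point back at it, so that they can serve as small test fixtures. The tag names Station and Stop and the stationRef attribute are assumptions made up for this example, not names from the real schema, and the sketch assumes the records are not nested inside each other.

```python
import io
import xml.etree.ElementTree as ET


def extract_related(source, station_id):
    """Return XML snippets for one station and the stops that reference it."""
    snippets = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Hypothetical names: <Station id="..."> and <Stop stationRef="...">.
        is_station = elem.tag == "Station" and elem.get("id") == station_id
        is_stop = elem.tag == "Stop" and elem.get("stationRef") == station_id
        if is_station or is_stop:
            snippets.append(ET.tostring(elem, encoding="unicode").strip())
        if elem.tag in ("Station", "Stop"):
            # Release the record once it has been inspected.
            elem.clear()
    return snippets


# Made-up sample data standing in for the real file.
sample = io.BytesIO(
    b'<root><Station id="a"/><Station id="b"/>'
    b'<Stop id="1" stationRef="a"/><Stop id="2" stationRef="b"/></root>'
)
for snippet in extract_related(sample, "a"):
    print(snippet)
```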

I'm thinking XSLT could have come in handy. Alternatively we could have tried to load the data into a database (SQLite or something?) and then manipulated it further. I am sure these kinds of things are easy if you have any kind of routine. Maybe I should have taken more XML courses at the uni after all...
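
As a rough idea of what the database route might look like, here is a small sketch using Python's built-in sqlite3 module. We did not try this at the event. The station table, the Station tag and its id and name attributes are hypothetical placeholders, and the real schema would need its own mapping.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_stations(source, conn):
    """Stream hypothetical <Station> elements into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS station (id TEXT PRIMARY KEY, name TEXT)"
    )
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Station":
            conn.execute(
                "INSERT INTO station (id, name) VALUES (?, ?)",
                (elem.get("id"), elem.get("name")),
            )
            # Release the record once it has been stored.
            elem.clear()
    conn.commit()


# In-memory database and made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
sample = io.BytesIO(
    b'<root><Station id="a" name="Keskusta"/>'
    b'<Station id="b" name="Satama"/></root>'
)
load_stations(sample, conn)
print(conn.execute("SELECT name FROM station WHERE id = ?", ("a",)).fetchall())
```

Once the rows are in a table, following the relations between records becomes a matter of plain SQL joins rather than repeated passes over the XML.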

We didn't get any test cases written yet, though. With some assistance from Tuukka we did manage to find one typo (Modetype vs. ModeType) and solved one of the three issues. There are still two issues left, but I can do something about those on my own.

Special thanks to Tsuri and Michael for showing up! It is always helpful to have some extra brains on the premises. I am not sure if Hemingways is actually ideal for coding-intensive events like this. This time it was uncharacteristically crowded. Perhaps we just got unlucky.



Copyright © Geek Collision