
So you want to parse a PDF?

03 Aug, 2025

Suppose you have an appetite for tilting at windmills. Let's say you love pain. Well then why not write a PDF parser today?

The ideal world: how the specification should work

Conceptually, parsing a PDF is fairly simple:

First, locate the version header comment at the start of the file
Next, locate the pointer to the cross-reference table
Then use the cross-reference table to find all object offsets
Finally, locate and build the trailer dictionary, which points to the catalog dictionary
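
As a minimal sketch of the first step, assuming the ideal-world case where the header sits at byte 0 and uses a single-digit major and minor version, the version can be read like this (the function name `read_pdf_version` is made up for illustration):

```python
import re

# Matches a header comment such as "%PDF-1.7".
# Assumption: single-digit major and minor version numbers.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def read_pdf_version(data: bytes) -> tuple[int, int]:
    """Return (major, minor) from the version header at the start of the file."""
    match = HEADER_RE.match(data)  # match() only succeeds at offset 0
    if match is None:
        raise ValueError("no %PDF-x.y header at the start of the file")
    return int(match.group(1)), int(match.group(2))
```

Calling `read_pdf_version(b"%PDF-1.7\n")` returns `(1, 7)`; anything that does not begin with the header raises `ValueError`.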

Introduction to PDF objects

A PDF object wraps some valid PDF content (numbers, strings, dictionaries, etc.) and identifies it by an object number and a generation number. The content is surrounded by the obj/endobj markers; for example, a simple number may have its own PDF object:

16 0 obj
620
endobj

This declares that object 16 with generation 0 contains the number 620.

A PDF file is effectively a graph of objects that may reference each other. Objects reference other objects by use of indirect references. These have the format "16 0 R" which indicates that the content should be found in object 16 (generation number 0). In this case that would point to the object 16 containing the number 620. It is up to producer applications to split file content into objects as they wish, though the specification requires that certain object types be indirect.
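
To make the object graph concrete, here is a rough sketch that indexes objects by (object number, generation number) and resolves an indirect reference. It scans the raw bytes with regular expressions, which is a simplification I am assuming for illustration: it ignores the cross-reference table and would be fooled by marker-like bytes inside strings or streams. The names `index_objects` and `resolve` are hypothetical.

```python
import re

# "16 0 obj ... endobj": object number, generation number, then the content.
OBJECT_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
# "16 0 R": an indirect reference to object 16, generation 0.
REFERENCE_RE = re.compile(rb"(\d+)\s+(\d+)\s+R\b")


def index_objects(data: bytes) -> dict[tuple[int, int], bytes]:
    """Map (object number, generation) to the raw bytes between obj and endobj."""
    objects = {}
    for match in OBJECT_RE.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        objects[key] = match.group(3).strip()
    return objects


def resolve(token: bytes, objects: dict[tuple[int, int], bytes]) -> bytes:
    """Follow a token of the form "N G R"; return any other token unchanged."""
    match = REFERENCE_RE.fullmatch(token.strip())
    if match is None:
        return token  # a direct value, not a reference
    return objects[(int(match.group(1)), int(match.group(2)))]
```

With the object from the example above indexed, `resolve(b"16 0 R", objects)` returns `b"620"`, while a direct value such as `b"620"` is returned as-is.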

Finding the cross-reference offset

To avoid the need to scan the entire file, PDFs declare a cross-reference table (xref). This is an index pointing to where each object in the file lives.

Each file ends with a pointer to the cross-reference table:

<< %trailer >>
startxref
116
%%EOF

This tells the parser to jump to byte offset 116 to find the xref table (or stream). In theory this pointer is right at the end of the file, according to the specification:

...
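
For the ideal-world case described above, reading the startxref pointer could be sketched as follows. The 1024-byte tail window and the function name `find_xref_offset` are my own assumptions for illustration, not values taken from the specification.

```python
import re

# "startxref", the byte offset on its own line, then the end-of-file marker.
STARTXREF_RE = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def find_xref_offset(data: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset that follows the last startxref keyword.

    Only the final tail_size bytes are searched, so the whole file
    does not need to be scanned.
    """
    tail = data[-tail_size:]
    matches = list(STARTXREF_RE.finditer(tail))
    if not matches:
        raise ValueError("no startxref pointer near the end of the file")
    # Take the last match in case the tail contains more than one.
    return int(matches[-1].group(1))
```

Given a file ending with the snippet shown earlier, this returns 116, and the parser would then seek to that offset to read the xref table or stream.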

Laureles, Medellin swimming pools

23 Apr, 2025

Just a brief note to summarise the (non-hotel) swimming pool situation in the Laureles area of Medellin since it is not easy to find/understand all this information.

There are 3 primary organisation types that are relevant:

Inder. Run by the mayoralty, this sports department offers access to many different sports facilities for residents and visitors alike. Swimming can be accessed for free through Inder.
Liga de Natacion de Antioquia. The swimming league of the department of Antioquia. They share access to many of the same facilities as Inder; however, they provide paid access to these facilities.
Private organisations. A third category that runs some pools.

It's worth noting that for all pool types the required swimwear is a swimming cap (even for bald people) and lycra (not board) shorts. There are 2 shops at Estadio that sell these, one near the Inder entrance and the other behind the stairs at the Liga entrance. There may also be a shop at Belen.

Pools

The main pools of interest are the Estadio and Belen sites. Belen has a single Olympic pool (see the private swimming section for details on a second 25m pool in Belen) and Estadio has a range of pools including Olympic, sub-aquatic and many others.

Estadio, has an Inder gate with ticket office here and a Liga gate here.
Belen, the pool is here.

Swimming with Inder

To use Inder pools with a reservation you will need to register with Simon 2.0. Simon 2.0 is the booking platform for various sports facilities including the pools at Estadio and Belen. Once registered you can book access to a pool session for free. However, as of Spring 2025 there is high demand, and because many pools at Estadio, including the Olympic pool, are currently being refurbished, the available reservation slots fill up quickly.

Another approach is to join the queue for entry at the pool in Estadio. Each hour (or outside the hour at the discretion of the ticket office) a certain number of people are let in from the queue for free, again depending on occupancy. I don't have first-hand experience with it but in the week it apparently doesn't take too long if you go early. The non-reservation entry queue may also exist at Belen but I haven't validated this.

You will need your original ID, either passport or Cedula (no photocopies), to enter with Inder.

Swimming with the Liga

The Liga de Natacion de Antioquia also operates at the Estadio and Belen locations. They don't seem to be mentioned if you ask about swimming in general at Inder locations.

...

Writing Code for Fun and ... That's It

30 Jul, 2023

Supreme Commander: Forged Alliance is a real-time strategy (RTS) game released in 2007. It is also the last good RTS, and potentially game, ever.

Despite this most gamers — not realizing gaming reached a pinnacle in 2007 and has since descended into a mess of RPGification, Loot Boxes and bloom over-use — have moved on to other games and more importantly other genres¹.

One need only look at the modern-game design horrors unleashed on the Dawn of War genre to see the decline of RTS and gaming generally more clearly. From the high-point of Dawn of War: Winter Assault in 2005 to the best-forgotten mess of Dawn of War III in 2017 everything has been ruined irrevocably. Nothing is good anymore, food doesn't taste the same, music isn't what it was². I'm not getting old, things are getting worse. I'm still cool and relevant!

Few things in gaming quite match the state of stressed, overwhelmed misery induced by Forged Alliance. Thankfully the Forged Alliance Forever (FAF) project has continued to develop Forged Alliance, providing graphical updates, balance changes, matchmaking, performance improvements, new maps and more.

Because gamers have departed from the True Path, matchmaking generally takes a while. A small active user-base means that, depending on the time of day and day of week, you can be waiting anywhere from 10 minutes to over an hour for a lobby (the pre-game matchmaking bit) to fill.

It is generally necessary to be ready to balance and start the lobby within a few minutes of it filling. If there's too much of a delay everyone will leave again and the waiting time will have been wasted.

Which brings me to the justification for my project, FAF Lobby Sim. While the lobby is running on my desktop I don't want to have to sit checking it every few minutes. It would be good to be able to do other things elsewhere while having a way to check on the occupancy status.

...

Blog Update

29 Jul, 2023

As this blog entered its tenth year it was desperately in need of a facelift. I had effectively stopped writing new posts because the overhead of remembering how to upload a new post each time was off-putting enough to stop me entirely.

For that reason it has been updated with new - hopefully more consistent - CSS. In addition the code and posts are deployed on each commit by GitHub Actions. This means I no longer need to SSH into the server to do things by hand, so the blog should be editable from wherever I have git access. I also removed the terribly insecure previous method of uploading posts directly onto the server, which was using a plain-text username and password.

Most importantly privacy-invading Google Analytics has been removed along with almost all JavaScript. I no longer see the need for analytics on this blog. The only remaining scripts are for code highlighting client-side as well as Disqus for commenting. Disqus is opt-in so will only load in if you click the "Load Comments" button at the bottom of each post.

Unfortunately I still rely on Google Fonts for the title font but the main body font has been updated to use the system font stack.

Finally it has been updated to .NET 7. The previous version was running .NET Core 1.1 and hadn't been updated since 2017. So much for keeping software up-to-date! The upgrade process was surprisingly easy.

While the changes between the ASP.NET versions have been a little hard to keep track of, with Startup.cs being removed and everything moving into Program.cs, the migration was made simple by my lazy cheat. I just created a new ASP.NET 7 application from the template in Visual Studio and moved most of the files without any changes, except to update the namespaces.

I also took the opportunity to add caching. Previously it loaded every post file from disk repeatedly just to show a single page. Now, because posts are only updated whenever the app itself is deployed and restarted, cache invalidation is trivial.
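
The idea can be illustrated with a small sketch. This is written in Python for brevity, not the blog's actual .NET code, and `load_post` is a made-up name: each post is read from disk once and then served from memory until the process restarts, so no explicit invalidation is needed.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_post(path: str) -> str:
    """Read a post from disk on first use; later calls return the cached text.

    The cache lives for the lifetime of the process, so a deploy that
    restarts the app is what "invalidates" it.
    """
    return Path(path).read_text(encoding="utf-8")
```

The trade-off is that edits made to a post file on disk are not seen until the restart, which is fine when posts only change on deploy.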

I have added the images into git too which generally causes a lot of squawking from people who use git properly (nerds), but keeping things simple should hopefully mean I fall into the pit of success. Each PNG image has been compressed further where possible.

...

Visual Studio 2022 Debugger Freezes

18 Feb, 2022

Wow, blogging with any kind of regularity is kind of hard I guess. Who knew?

Anyway this was just a quick note for a problem I had recently.

I found that my Visual Studio 2022 debugger would hang, freeze or become unresponsive intermittently. Prior to the last update the whole process would freeze completely. After updating, the debugger would still freeze, though the IDE itself remained responsive.

This freeze seemed to trigger more frequently when using time-travel debugging.

As usual when weird things happen the antivirus was to blame.

Excluding the process named VsDebugConsole.exe from the antivirus seemed to resolve all the debugger hangs, at least so far. No doubt some more will occur but at least VS22 is usable again.

To get to exclusions on Windows 10 (old-skool) go to Windows Security -> Virus & threat protection -> Virus & threat protection settings -> Manage settings -> Exclusions -> Add or remove exclusions.

...
