
Open and Create PNG Images in C#


I wanted a platform-independent way to open and create PNG images in C#. BigGustave is a new library that provides a .NET Standard 2.0 compatible way of opening and creating PNG images.

To open a PNG image, pass either the bytes or the stream of the image to Png.Open, then retrieve the pixel values at any location in the image:

Png png = Png.Open(File.ReadAllBytes(@"C:\pictures\example.png"));
Pixel first = png.GetPixel(0, 0);
Console.WriteLine($"R: {first.R}, G: {first.G}, B: {first.B}");

To create a PNG image in C#, use PngBuilder to define pixel values before saving to an output stream:

var builder = PngBuilder.Create(2, 2, false);

var red = new Pixel(255, 0, 0);

builder.SetPixel(red, 0, 0);
builder.SetPixel(red, 1, 1);

using (var memory = new MemoryStream())
{
    // Write the encoded PNG into the stream, then return its bytes.
    builder.Save(memory);
    return memory.ToArray();
}

BigGustave is completely open source and available on NuGet now, so if you need very basic PNG manipulation tools for platform-independent .NET code, why not check it out?


PdfPig Version 0.0.5


Today version 0.0.5 of PdfPig was released. This is the first version which includes the ability to create PDF documents in C#.

There aren't many fully open source options around for both reading and writing PDF documents, so the addition of PDF document creation to PdfPig is an exciting next step for the API.

The design of document creation isn't finished yet, and there's more work to be done on currently unsupported use cases such as splitting, merging and editing existing documents, adding non-ASCII text, working with forms and adding images to new documents. However, the functionality in 0.0.5 should be enough for simple use cases, and the open source Apache 2.0 license means it can be used in commercial software.

You can create a new document using a document builder:

PdfDocumentBuilder builder = new PdfDocumentBuilder();

This creates a completely empty document. To add the first page we use the imaginatively named AddPage method:

PdfPageBuilder page = builder.AddPage(PageSize.A4);

This supports various page sizes defined by the PageSize enum, such as the North American standard PageSize.Letter. It also allows the choice of portrait (default) or landscape pages.
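To make the portrait and landscape choice concrete, here is a small sketch of the arithmetic behind page dimensions. It does not use PdfPig and is not how PdfPig implements page sizes. The PageDimensions class and its methods are hypothetical names invented for this illustration. It relies only on the standard paper sizes (A4 is 210 by 297 mm, Letter is 8.5 by 11 inches) and on the PDF default unit of 72 points per inch.

```csharp
using System;

public static class PageDimensions
{
    // PDF user space defaults to points: 72 points per inch.
    private const double PointsPerInch = 72.0;
    private const double MillimetresPerInch = 25.4;

    public static double MillimetresToPoints(double millimetres)
    {
        return millimetres / MillimetresPerInch * PointsPerInch;
    }

    // Returns (width, height) in points. Landscape simply swaps the two sides.
    public static (double Width, double Height) Size(
        double shortSidePoints, double longSidePoints, bool isPortrait = true)
    {
        return isPortrait
            ? (shortSidePoints, longSidePoints)
            : (longSidePoints, shortSidePoints);
    }

    public static void Main()
    {
        var a4 = Size(MillimetresToPoints(210), MillimetresToPoints(297));
        var letterLandscape = Size(8.5 * PointsPerInch, 11 * PointsPerInch, isPortrait: false);

        Console.WriteLine($"A4 portrait: {a4.Width:F2} x {a4.Height:F2} points");
        Console.WriteLine($"Letter landscape: {letterLandscape.Width:F0} x {letterLandscape.Height:F0} points");
    }
}
```

A4 in portrait comes out at roughly 595 by 842 points, and Letter in landscape at 792 by 612 points.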

Once a page builder has been created, text, lines and rectangles can be added to it.


To draw text, a font must be chosen. Version 0.0.5 supports TrueType fonts as well as the 14 default fonts detailed in the PDF Specification. These are called the Standard 14 fonts and, while their use is beginning to be phased out, all PDF readers should still support them.


Sentence Boundary Detection in C#


Sentence Boundary Detection or Segmentation is the task of splitting an input passage of text into individual sentences. Since the period '.' character may be used in numbers, ellipses or names it's not enough to simply split by the period character.
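To see why a plain split falls short, here is a minimal sketch that splits a short passage on every period. The NaiveSplitDemo class and its method are made up for illustration and are not part of any library.

```csharp
using System;
using System.Linq;

public static class NaiveSplitDemo
{
    // Splits on every '.' character, which is the approach that does not work.
    public static string[] SplitOnPeriod(string text)
    {
        return text.Split('.')
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        // Two sentences, but three extra periods: an abbreviation and a decimal.
        string text = "Dr. Smith paid 3.50 for it. He left.";

        foreach (string part in SplitOnPeriod(text))
        {
            Console.WriteLine(part);
        }
    }
}
```

The two-sentence input comes back as four fragments ("Dr", "Smith paid 3", "50 for it" and "He left") because the abbreviation and the decimal point are both treated as sentence ends.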

When I was researching ways to do this in C# I didn't find much in the way of properly open source libraries. A lot of the libraries I found for other languages referred to the Golden Rule Set (GRS). This set comes from Pragmatic Segmenter, a Ruby gem to segment text based on rules observed from a varied corpus of text.

Since I find porting code from other languages helps me understand both the variations in how different languages approach the same problems and also how other people make architectural decisions and structure their code, I decided to port it to C#.

This Pragmatic Segmenter port is available to download from NuGet. The public API is similar to that of the Ruby package; however, the method is static:

var result = Segmenter.Segment("There it is! I found it.");

Assert.Equal(new[] { "There it is!", "I found it." }, result);

There is also support for other languages; the Language enum lists the supported languages:

var result = Segmenter.Segment("Salve Sig.ra Mengoni! Come sta oggi?", Language.Italian);
Assert.Equal(new[] { "Salve Sig.ra Mengoni!", "Come sta oggi?" }, result);

The source code also contains a set of data from various sources, which I used to test my port and to add some behaviour for the sources I was primarily interested in (academic journals). Hopefully this corpus of annotated sentence boundary data will be useful to people building their own libraries.


Using ConvNetSharp With Feature Based Data


ConvNetSharp, which is descended from ConvNetJS, is a library that lets you use neural networks in .NET without calling out to other languages or services.

ConvNetSharp also has GPU support which makes it a good option for training networks.

Since much of the interest (and as a result the guides) around neural networks focuses on their utility in image analysis, it's slightly unclear how to apply these libraries to the numeric and categorical features you may be used to using with SVMs or other machine learning methods.

The aim of this blog post is to note how to achieve this.

Let's take the example of some data observed in a scientific experiment. Perhaps we are trying to predict which snails make good racing snails.

Our data set looks like this:

Age   Stalk Height    Shell Diameter    Shell Color   Good Snail?
1     0.52            7.6               Light Brown   No
1.2   0.74            6.75              Brown         Yes
1.16  0.73            7.01              Grey          Yes
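Before a table like this can be given to any network, each row has to become a list of numbers. One common approach is to keep numeric columns as they are and one-hot encode categorical columns. This is an assumption for illustration, not something confirmed as ConvNetSharp's requirement. The sketch below does not call ConvNetSharp at all, and the SnailFeatures class and its method names are hypothetical.

```csharp
using System;

public static class SnailFeatures
{
    // Categories seen in the sample table; a real data set may contain more.
    private static readonly string[] ShellColors = { "Light Brown", "Brown", "Grey" };

    // Builds one flat vector: three numeric features followed by a one-hot
    // block holding a single 1.0 at the index of the shell color.
    public static double[] Encode(double age, double stalkHeight, double shellDiameter, string shellColor)
    {
        int colorIndex = Array.IndexOf(ShellColors, shellColor);
        if (colorIndex < 0)
        {
            throw new ArgumentException("Unknown shell color: " + shellColor, nameof(shellColor));
        }

        var features = new double[3 + ShellColors.Length];
        features[0] = age;
        features[1] = stalkHeight;
        features[2] = shellDiameter;
        features[3 + colorIndex] = 1.0;
        return features;
    }

    // Maps the "Good Snail?" column to a class index: 0 for No, 1 for Yes.
    public static int EncodeLabel(string goodSnail)
    {
        return string.Equals(goodSnail, "Yes", StringComparison.OrdinalIgnoreCase) ? 1 : 0;
    }

    public static void Main()
    {
        // Second row of the table above.
        double[] row = Encode(1.2, 0.74, 6.75, "Brown");
        Console.WriteLine(string.Join(", ", row));
        Console.WriteLine(EncodeLabel("Yes"));
    }
}
```

The second row of the table becomes the six values 1.2, 0.74, 6.75, 0, 1, 0 with class index 1. In practice the numeric columns would often also be scaled to a similar range, which this sketch leaves out.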

ConvNetSharp uses the concept of Volumes to deal with input and classification data. A Volume is a 4-dimensional shape containing data.

The 4 dimensions are:


Alpha Release of PdfPig


I'm very pleased to finally have reached the first alpha release of PdfPig (NuGet).

PdfPig (GitHub) is a library that reads text content from PDFs in C#. This will help users extract and index text from a PDF file using C#.

The current version of the library provides access to the text and text positions in PDF documents.


The library began as an effort to port PDFBox from Java to C# in order to provide a native open-source solution for reading PDFs with C#. PdfPig is Apache 2.0 licensed and therefore avoids questionably (i.e. not at all) 'open-source' copyleft viral licenses.

I had been using the PDFBox library through IKVM and started the project to investigate the effort required to make PDFBox work natively with C#.

In order to understand the specification better I rewrote quite a few parts of the code, resulting in many more bugs and fewer features than the original code.

As the alpha is (hopefully) used and issues are reported, I will refine the initial public API. I can't foresee the API expanding much beyond its current surface area for the first proper release.
