
Using ConvNetSharp With Feature Based Data

21 Apr, 2018

ConvNetSharp, which is descended from ConvNetJS, is a library that lets you use neural networks in .NET without calling out to other languages or services.

ConvNetSharp also has GPU support which makes it a good option for training networks.

Since much of the interest (and as a result most of the guides) around neural networks focuses on their utility in image analysis, it's slightly unclear how to apply these libraries to the numeric and categorical features you may be used to using for SVMs or other machine learning methods.

The aim of this blog post is to note how to achieve this.

Let's take the example of some data observed in a scientific experiment. Perhaps we are trying to predict which snails make good racing snails.

Our data set looks like this:

Age   Stalk Height    Shell Diameter    Shell Color   Good Snail?
1     0.52            7.6               Light Brown   No
1.2   0.74            6.75              Brown         Yes
1.16  0.73            7.01              Grey          Yes
etc...
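
Before data like this can be handed to a network, each row has to become a list of numbers. The sketch below shows one common way to do that, assuming the categorical Shell Color column is one-hot encoded and the numeric columns are passed through unchanged. The SnailFeatures class, its method names and the fixed list of three colors are hypothetical and only for illustration; they are not part of ConvNetSharp.

```csharp
using System;

// Hypothetical helper (not part of ConvNetSharp): converts one row of the
// snail table into a flat array of doubles.
public static class SnailFeatures
{
    // Assumption: these are the only shell colors in the data set.
    private static readonly string[] ShellColors = { "Light Brown", "Brown", "Grey" };

    // Layout: [age, stalk height, shell diameter, one-hot shell color...]
    public static double[] Encode(double age, double stalkHeight, double shellDiameter, string shellColor)
    {
        var colorIndex = Array.IndexOf(ShellColors, shellColor);
        if (colorIndex < 0)
        {
            throw new ArgumentException($"Unknown shell color: {shellColor}", nameof(shellColor));
        }

        var features = new double[3 + ShellColors.Length];
        features[0] = age;
        features[1] = stalkHeight;
        features[2] = shellDiameter;

        // One-hot encoding: a single 1.0 in the slot for this color.
        features[3 + colorIndex] = 1.0;

        return features;
    }

    // The "Good Snail?" column becomes a class index: 1 for Yes, 0 for No.
    public static int EncodeLabel(string goodSnail)
    {
        return string.Equals(goodSnail, "Yes", StringComparison.OrdinalIgnoreCase) ? 1 : 0;
    }
}
```

For the second row of the table, SnailFeatures.Encode(1.2, 0.74, 6.75, "Brown") gives six values: the three measurements followed by 0, 1, 0 for the color. In practice the numeric columns would probably also be scaled to a similar range before training.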

ConvNetSharp uses the concept of Volumes to deal with input and classification data. A Volume is a 4-dimensional shape containing data.

The 4 dimensions are:

...

Alpha Release of PdfPig

17 Jan, 2018

I'm very pleased to finally have reached the first alpha release of PdfPig (NuGet).

PdfPig (GitHub) is a library that reads text content from PDFs in C#. This will help users extract and index text from a PDF file using C#.

The current version of the library provides access to the text and text positions in PDF documents.

Motivation

The library began as an effort to port PDFBox from Java to C# in order to provide a native open-source solution for reading PDFs with C#. PdfPig is Apache 2.0 licensed and therefore avoids questionably (i.e. not at all) 'open-source' copyleft viral licenses.

I had been using the PDFBox library through IKVM and started the project to investigate the effort required to make PDFBox work natively with C#.

In order to understand the specification better I rewrote quite a few parts of the code, resulting in many more bugs and fewer features than the original code.

As the alpha is (hopefully) used and issues are reported, I will refine the initial public API. I can't foresee the API expanding much beyond its current surface area for the first proper release.

...

Configuring SonarQube with GitLab and TeamCity

05 Dec, 2017

Introducing static analysis to a project can help inform code reviews and highlight areas of the code likely to cause errors as well as expose trends in code quality over time. The tradeoff is that there are often many false positives in a report which need to be investigated.

When I configured SonarQube (6.4) to provide static analysis for our C# project, we struggled to incorporate it into our normal development process since it sat outside the usual branch -> build -> merge request workflow.

For our source control we were using GitLab (10.1.4) and our build server was running TeamCity (2017.1).

Get the plugin

Gabriel Allaigre has written the sonar-gitlab plugin which enables SonarQube to push its analysis results to GitLab. This presents the results of analysis in the same place we review our merge requests and causes build errors when violations occur, which helps incorporate SonarQube into the development workflow.

First you will need to install the sonar-gitlab plugin to your SonarQube environment and follow the steps detailed in the configuration section of the readme:

  1. Set the GitLab URL under Administration -> Configuration -> General Settings -> GitLab.
  2. Set the GitLab user token in the same place. This should be a token for a GitLab user with the developer role. You can get this token in GitLab by going to Profile -> Edit Profile -> Access Tokens and generating a new access token.

Once this is done, the SonarQube configuration is complete.

Configure TeamCity

The installation guide for the sonar-gitlab plugin describes how to configure it when using GitLab CI or Maven for builds. To run the analysis from TeamCity we need some additional information for the command line parameters.

If we were running from GitLab's CI we would use the following command to start the Sonar MSBuild Scanner, pushing to GitLab after the analysis completed:

...

Visual Studio 2017 Red Underline/Incorrect Highlights

30 Nov, 2017

There are a lot of answers on this topic but, in order to collect the steps I usually follow in one place for future reference, I'm noting them in this blog post.

There are few things more annoying than IntelliSense going wobbly and flagging successfully compiling code with errors. Obviously the first step is to restart Visual Studio, but if the problem persists you need to try something more. These steps are for Visual Studio 2017 Community with ReSharper.

You can check whether the underlines have disappeared after each step or run them all:

  1. Unload then reload the problematic project from Solution Explorer. To do this, right click the project, select Unload Project and then Reload Project. This sometimes helps clear incorrect highlighting due to ReSharper, especially after merges.
  2. Clear the ReSharper cache. This is accessed by going to ReSharper > Options > General > Clear caches in the menu. You will need to restart Visual Studio to see if this step worked.
  3. Disable ReSharper from Tools > Options > ReSharper > Suspend Now. Then start it again from the same location.
  4. Close Visual Studio and then delete the .vs folder from the source folder. This is a hidden folder at the same level as the .sln file.
  5. With Visual Studio closed delete the obj folders from the problematic project folders.
...
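
Steps 4 and 5 above can be scripted if you find yourself repeating them. The following is a minimal sketch, assuming the solution uses .csproj projects and that each project's obj folder sits next to its project file. The SolutionCleaner class is a hypothetical helper, not something Visual Studio or ReSharper provides, and Visual Studio should be closed before running it.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: removes the .vs folder beside the .sln file and the
// obj folder beside every .csproj file found below the solution directory.
public static class SolutionCleaner
{
    // Returns the directories that were actually deleted.
    public static IReadOnlyList<string> Clean(string solutionDirectory)
    {
        // Step 4: the hidden .vs folder at the same level as the .sln file.
        var targets = new List<string> { Path.Combine(solutionDirectory, ".vs") };

        // Step 5: an obj folder next to each project file.
        var projectFiles = Directory.EnumerateFiles(
            solutionDirectory, "*.csproj", SearchOption.AllDirectories);
        targets.AddRange(projectFiles.Select(
            projectFile => Path.Combine(Path.GetDirectoryName(projectFile), "obj")));

        var deleted = new List<string>();
        foreach (var target in targets.Where(Directory.Exists))
        {
            Directory.Delete(target, recursive: true);
            deleted.Add(target);
        }

        return deleted;
    }
}
```

Calling SolutionCleaner.Clean with the folder that contains the .sln file deletes only folders that exist and leaves everything else untouched. The obj folders are recreated on the next build.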

The Curious Case of the Null StringBuilder

21 Aug, 2017

Today was spent tracking down a very weird bug. In production we were seeing an important part of our document reading fail. We kept getting NullReferenceExceptions when calling AppendLine on a non-null StringBuilder. It didn't prevent us from reading the document; however, the result would be significantly different from the result for the same document on a local instance of our program.

It only started occurring after the production server had been running for a few days, which meant we couldn't debug it locally. We were running .NET 4.5.2.

Luckily we had lots of logging to track down the issue. The problem was in a class like this:

using System;
using System.Text;

public class AlgorithmLogicLogger
{
    private readonly StringBuilder stringBuilder;

    public AlgorithmLogicLogger()
    {
        stringBuilder = new StringBuilder();
    }

    public void Append(string s)
    {
        var message = BuildStringDetails(s);

        stringBuilder.AppendLine(message);
    }

    private static string BuildStringDetails(string s)
    {
        return $"{DateTime.UtcNow}: {s}";
    }
}

This was a class which was originally intended to provide detailed logging for a complicated algorithm.

The call to StringBuilder.AppendLine() inside Append was throwing a NullReferenceException.

After ensuring no weird reflection was taking place and using Ildasm to inspect the compiled code, we were sure it wasn't possible for stringBuilder to be null. It was always instantiated in the constructor and never changed elsewhere.

The next working theory was that a multi-threading issue was somehow calling Append prior to the field being set. This was also discounted, both because it wouldn't have been possible and because the code in question was not called from multiple threads.

After ruling out the possibility that the garbage collector was somehow incorrectly collecting the string builder (it was only ever written to, never read, since the reading hadn't been implemented yet), we were beginning to run out of ideas.

...