Tech – Page 44 – Andy Balaam's Blog

Checking the case of a filename on Windows

Windows generally uses a case-insensitive but case-preserving file system.

When writing some code that is intended to be used on Linux as well as Windows, I wanted it to fail on Windows in the same cases that it would fail on Linux, and this meant detecting when the case of a filename differed from its canonical case on the file system.

I want to ask “is this file name correct in terms of case?”

I was working in Java, but I think this issue would be similar in other languages: it’s difficult to ask for the canonical case version of a file name when we currently have a filename with arbitrary case.

The only solution I came up with was to list the contents of the parent directory and check whether my arbitrary filename is listed with the correct case in the results:

// CaseCheck.java

import java.util.Arrays;
import java.io.File;
import java.io.IOException;

class CaseCheck
{
    private static File parentFile( File f )
    {
        File ret = f.getParentFile();
        if ( ret == null )
        {
            ret = new File( "." );
        }
        return ret;
    }

    private static boolean existsAndCaseCorrect( String fileName )
    {
        File f = new File( fileName );
        return Arrays.asList( parentFile( f ).list() ).contains( f.getName() );
    }

    public static void main( String[] args ) throws IOException
    {
        System.out.println( existsAndCaseCorrect( args[0] ) );
    }
}

Checking it on its own source file:

javac CaseCheck.java && java CaseCheck cASEcheck.java
false

javac CaseCheck.java && java CaseCheck CaseCheck.java
true

It seems to work.

Note that this also returns false if the file doesn’t exist, and will throw an error if the file name specifies a parent directory that doesn’t exist.

Passing several values through a pipe in bash

I have been fiddling with some git-related shell scripts, and decided to try and follow the same approach as git in their structure. This means using the Unix approach, in which each piece of functionality is a separate script (or executable) that communicates by using command-line arguments, reading from the standard input stream, and writing output to the standard output stream.

This allows each piece of functionality to be written in any programming or scripting language. In git’s case this has allowed initial versions to be written in bash or perl and later optimised versions (sometimes written in C) to be dropped in, piece by piece. It’s an incredibly flexible way of working and can also be very efficient.

Most of my prototyping has been in bash, and I’ve found sometimes I need to write out multiple values from a script and collect them as input in another script.

Writing the output is simple:

#!/bin/bash

# outputter.bash

# Imagine A, B and C have been created by some complex process:
A="foo bar"
B="  bar"
C="baz   "

# At the end of our script we simply write them out on separate lines in a known order
echo "${A}"
echo "${B}"
echo "${C}"

But reading them in somewhere else gave me some trouble until I learned this recipe:

#!/bin/bash

# inputter.bash

# Read in the values one per line:
# (-r stops read from treating backslashes in the input as escape characters)
IFS=$'\n' read -r A
IFS=$'\n' read -r B
IFS=$'\n' read -r C

# Now we can use them.
echo "A='${A}'"
echo "B='${B}'"
echo "C='${C}'"

And now the values transfer successfully, preserving whitespace:

$ ./outputter.bash | ./inputter.bash 
A='foo bar'
B='  bar'
C='baz   '

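The recipe uses bash’s built-in read command to populate the variables, but sets the IFS variable (Internal Field Separator) to a newline for each call. With spaces and tabs no longer counted as separators, read does not trim them from the ends of the line, so all the whitespace in the line is treated as part of the value to be read. The $'\n' syntax is bash’s ANSI-C quoting, which here expands to a literal newline.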

How to use git (the basics)

Series: Why git?, Basics, Branches, Merging, Remotes

Git is a very powerful tool, but somewhat intimidating at first. I will be making some videos working through how to use it step by step.

First, we look at how to track your own code on your own computer, and then get a brief look at a killer feature: stash, which lets you pause what you were doing and come back to it later.

Slides: How to use git (the basics) slides.

Using gnome-mplayer to play DVB radio without asking whether you want to resume

When I launch gnome-mplayer to play back radio over my TV card (DVB), it asks me whether I want to resume from where I left off, which doesn’t make sense for this kind of stream.

I couldn’t find a way to switch this off, but a little hacking with gnome-mplayer’s sqlite database does the trick.

Here’s my Radio 4 launch script:

#!/bin/bash

URI="dvb://BBC Radio 4"

sqlite3 ~/.config/gnome-mplayer/gnome-mplayer.db "DELETE FROM media_entries WHERE uri='${URI}'"

gnome-mplayer "${URI}"

C++14 “Terse” Templates – an argument against the proposed syntax

Today I attended two excellent talks by Bjarne Stroustrup at the ACCU Conference 2013. The first was an inspiring explanation of the recent C++11 standard, and the second, “C++14 Early thoughts”, was an exciting description of some of the features that might go into the next standard.

One of those features, which Bjarne called “Terse” Templates, might be a good idea, but the syntax Bjarne proposed seems like a bad idea to me, because it leaks unwanted names into the namespace containing the function you are writing.

Allow me to explain.

Background – Concepts Lite

I attended another excellent talk before Bjarne’s, called “Concepts Lite-Constraining Templates with Predicates” by Andrew Sutton, introducing “Concepts Lite”, which is an attempt to salvage a manageable language feature from the very large “Concepts” feature that failed to make it into C++11.

My (so far very basic) understanding of Concepts Lite is that it is a way of defining conditions that state whether a template will be expanded for a given type.

So, in C++11 (and C++98), we can declare a (stupid) template function like so:

template<typename ListOfInt>
int first( ListOfInt& list ) { return list.size() > 0 ? list[0] : 0; }

The code in this function template assumes that list has a size method, and an operator[] method. We tried to “suggest” this, by naming our template parameter ListOfInt, but the poor programmer may not realise exactly what we meant.

If we do the wrong thing, and try to use the first function with an int argument:

int i = 3;
first( i );

It goes wrong, because ints don’t have a size method:

In function 'int first(ListOfInt&) [with ListOfInt = int]':
error: subscripted value is neither array nor pointer
error: request for member 'size' in 'list', which is of non-class type 'int'

This error is not too obscure, but in complex cases the errors can be extremely long, and point to problems that appear to be unrelated to the code we are writing.

Really what we want to know is that int is not a ListOfInt.

Concepts Lite gives us the ability to define what a ListOfInt means, and only expand the template for types that match that definition.

In our example we would do something like this:

template<typename ListOfInt> requires SizeAndIndex<ListOfInt>()
int first( ListOfInt& list ) { return list.size() > 0 ? list[0] : 0; }

(There is actually a neater syntax, but we’ll do it like this for now because we need the more verbose form later.)

What this means is that this template function will only be expanded for types that satisfy the constraint.

The definition of SizeAndIndex is outside the scope of this article – it allows us to check whether types satisfy some conditions. In this case we assume it checks that the type contains the methods we use.
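Although the real definition is out of scope, it may help to see the general shape of such a check. What follows is only a rough approximation in plain C++11, using expression SFINAE and std::enable_if in place of the proposed requires syntax. The body of the SizeAndIndex trait here is an assumption made for illustration, not something taken from the Concepts Lite proposal, and it will not produce the friendly error messages the proposal is aiming for.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the SizeAndIndex constraint.
// Primary template: by default a type does not satisfy the constraint.
template<typename T, typename = void>
struct SizeAndIndex : std::false_type {};

// This specialisation is only viable when both expressions inside the
// decltype are well-formed for T, i.e. T has size() and operator[].
template<typename T>
struct SizeAndIndex<T, decltype(
    (void)std::declval<T&>().size(),
    (void)std::declval<T&>()[0],
    void() )> : std::true_type {};

// The function template is removed from overload resolution for any
// type that fails the check, instead of failing inside its body.
template<typename ListOfInt>
typename std::enable_if<SizeAndIndex<ListOfInt>::value, int>::type
first( ListOfInt& list ) { return list.size() > 0 ? list[0] : 0; }
```

With this sketch, calling first on a std::vector<int> works as before, while calling it on an int is rejected because no matching function exists, rather than because of an error inside the function body.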

Now when we do the wrong thing:

int i = 3;
first( i );

We get a simple error message, that properly tells us what’s wrong:

error: no matching call to ‘first(int list)’
note: candidate is ‘first(ListOfInt& list)’
note: where ListOfInt = int
note: template constraints not satisfied
note: ‘ListOfInt’ is not a/an ‘SizeAndIndex’ type since
note: ‘list.size()’ is not valid syntax

(The above is fiction, but Andrew assures us he gets real errors like this with his prototype.)

So Concepts Lite gives us the optional ability to check that our template parameters are what we expected them to be, giving a decent error message, instead of waiting for something to fail much later when we compile the instantiated template.

So far so utterly cool. (And, in my ill-informed opinion, the only bit of Concepts I really wanted anyway.)

There’s more information on this feature here: Concepts Lite: Constraining Templates with Predicates and here: Concepts-Lite.

Constraints on multiple types

The Concepts Lite feature as proposed allows us to specify constraints that describe how multiple types relate to each other, by doing something like this:

template<typename Victim1, typename Victim2> requires Lakosable<Victim1, Victim2>
void lakos( Victim1 a, Victim2 b );

Here the Lakosable constraint can specify conditions that describe how the two types relate to each other, for example that Victim1::value_type is equal to the type of Victim2.
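To make the two-type case concrete, here is a minimal sketch of that exact condition, again approximated with C++11 type traits rather than the proposed syntax. The definition of Lakosable below is purely hypothetical: it assumes the only requirement is that Victim1::value_type is the same type as Victim2, and the body of lakos is left empty.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical two-type constraint. Primary template: not satisfied.
template<typename V1, typename V2, typename = void>
struct Lakosable : std::false_type {};

// Satisfied only when V1 has a nested value_type and that type is
// exactly V2. If V1 has no value_type, substitution fails quietly
// and the primary template (false) is used instead.
template<typename V1, typename V2>
struct Lakosable<V1, V2, typename std::enable_if<
    std::is_same<typename V1::value_type, V2>::value>::type>
    : std::true_type {};

// lakos only takes part in overload resolution when the pair of
// argument types satisfies the constraint. It returns void.
template<typename Victim1, typename Victim2>
typename std::enable_if<Lakosable<Victim1, Victim2>::value>::type
lakos( Victim1 a, Victim2 b ) { (void)a; (void)b; }
```

Under these assumptions, lakos( std::vector<int>{}, 3 ) compiles, while lakos( std::vector<int>{}, 3.0 ) is rejected because the constraint relating the two types is not met.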

This is very good.

Now, the bit I want to argue against.

“Terse” Templates – the syntax I don’t like

Bjarne gave us an example of the std::merge function, which has lots of arguments, and very complex constraints on them. He showed us that these could all be nicely wrapped into a single Mergeable constraint (similar to the Lakosable constraint above) but he argued that there was still too much repetition. The repetition comes from the fact that several functions in the standard library have the exact same template parameters, with the exact same constraints on them, and that you have to mention the whole list of template parameters twice: once after the template keyword, and once in the requires condition.

This led him to look for a terser syntax.

So, he proposed a modest new construct that looks like this:

using Lakosable{Victim1,Victim2}; // (1)

that allows a radical departure from everything that has gone before in terms of declaring templates. After we’ve made the declaration (1), we can declare the exact function we declared above with this little line:

void lakos( Victim1 a, Victim2 b ); // (2)

The using declaration in (1) makes the names Victim1 and Victim2 available in the current namespace, and gives them special powers that mean functions taking parameters of type Victim1 or Victim2 are automatically function templates, even though the template keyword is nowhere to be seen.

There was some resistance in the room to this proposal. Most of it focussed on (2), and the fact that templates were being declared without it being visible because of the lack of the template keyword.

I’m actually ok with (2). In fact, my fictitious programming language Pepper (which represents everything I think is Right in programming languages) provides a feature very much like this – all non-definite parameter types act as “implicit” templates in Pepper (see “implicit_templates.pepper” on the Examples page).

Bjarne made a reasonable defence of (2), arguing that we often want new features to be “signposted” by new keywords (he cited user-defined types as an example – apparently some people wanted to require “class MyClass” instead of just “MyClass” every time we referred to a user-defined type) but later when they are familiar we want less verbose syntax. (Presumably the “new” feature he was talking about here is templates.)

My problem is with (1).

As my neighbour in the talk (whose name I missed, sorry) pointed out, what (1) does is dump 2 new names Victim1 and Victim2 in the namespace containing the lakos function template.

No-one wants these names.

In fact, why are we doing any of this?

The sole purpose of the exercise is to constrain the lakos function template. Why is the result putting 2 names into the namespace?

More seriously, in the case of the standard library, these names will go into the std:: namespace, and there could easily be clashes. If the std::merge function uses the name For for one of its template parameters (a Forward_iterator), and std::copy wants to use one with the same name, but with different constraints, it will override the definition of For.

I.e. if we do this:

namespace std {
using Mergeable{For,For2,Out};
// define std::merge
}

// and somewhere else:

namespace std {
using Copyable{For,Out};
// define std::copy
}

then the (useless) value of std::For will be different depending on the order in which we include the header files.

I think.

Please correct me if I’m wrong.

Conclusion

If I’m right, this all seems bad and Wrong.

What was wrong with:

template<typename Victim1, typename Victim2> requires Lakosable<Victim1, Victim2>
void lakos( Victim1 a, Victim2 b );

anyway?