Empty strings, resource fallback and spreadsheet import

In version 4.53.0005 of the Multi-Language Add-In for Visual Studio 2008/2005, I have made a change to the handling of empty strings during the import from an Excel or OpenDocument spreadsheet.

In the new version, empty strings in a spreadsheet are interpreted as no translation rather than an empty translation. This is a fairly subtle distinction, which deserves some explanation. What do I mean by no translation and an empty translation?

Before getting into that, it's important to understand the resource fallback mechanism.

Resource Fallback

The resource manager in a .NET application will load resources using a three level hierarchy:

1The specific language. The specific language used by the application, for example French(France).
2The neutral languageThe corresponding neutral language, for example French.
3The original languageThe non localized resources compiled into the application.

If there is no resource for the specific language, then it will automatically look for (or fall back to) a resource in the neutral language. If there is no resource for the neutral language, it will use the non localized resource.

In practice, there will be very little difference between the specific language and the neutral language. If you bother with the specific language at all, then you only need to define those texts which are actually different from the neutral language.

The Multi-Language Add-In shows texts which result from resource fallback in light grey text. In the following screenshot, only the text "Passt schon!" is defined for German(Austria). The other texts for German(Austria) are inherited from German.
Image
(Please don't think that "Passt schon!" is a sensible translation of OK to Austrian German. It is not.)

"No translation" or "emtpy translation"

We can now come back to the terms which I introduced at the start.

No translation

After describing resource fallback, it should be fairly clear what I mean by no translation. For a given language, there is simply no translation, which means that resource fallback will be used to find the text.

Empty translation

An empty translation is simply a text of zero length.

A text of zero length will be stored as a resource string of zero length. Remember that resource fallback occurs when there is no resource at all, not when the string is empty.

Empty translations are bad practice

Why would you ever translate want an emtpy translation?

The most likely case is that you have created a sentence by concatenating texts, for example:

German"Datei " & FileName & " kann nicht geöffnet werden"
English"Can't open file " & FileName

To localize this code (from German to English) you might translate "Datei " with "Can't open file " and " kann nicht geöffnet werden" with an emtpy string.

The reason that this is bad practice is, that you can't possibly give these texts to a translator. No translator would ever translate "Datei " with "Can't open file " let alone " kann nicht geöffnet werden" with an emtpy string.

In addition, the texts contain leading and trailing spaces which have to be preserved. Again, you can't reasonably expect a translator to respect these details.

Instead of building a sentence using string concatenation, you should use the String.Format() method, and define the text as full sentence using placeholders for variables. For example:

GermanString.Format ( "Datei {0} kann nicht geöffnet werden", FileName )
EnglishString.Format ( "Can't open file {0}", FileName )

Of course, you will need to explain this to a translator, but then you can expect a correct translation.

Best practiceUse full sentences with placeholders for variables.

Never build sentences with string concatenation.

Why would anyone use string concatenation?

Building sentences with string concatenation is quite common in VB6, because there is no equivalent of the String.Format() method. Programmers who have moved to VB.NET from VB6 may have stuck with this habit.

If you are still programming in VB6, you can use the following simple function, which substitutes the placeholders %0 to %9 with the function parameters.

Substitute Function for VB6
'MsgBox Substitute("Do you want to delete the file %0",FileName)
Public Function Substitute(ByVal FormatString As String, ParamArray Params() As Variant) As String
  Dim Index As Long
  For Index = LBound(Params) To UBound(Params)
    FormatString = Replace(FormatString, "%" & Index, Params(Index))
  Next
  Substitute = FormatString
End Function

(By the way, the VB6 Add-In contains a Substitute Assistant in the tools menu, which can automatically convert string concatenation into calls to a Substitute function. This feature has never got past the beta stage and has not made it into the .NET Add-In, but it might arrive at some time in the future.)

Before you dismiss string concatenation as a bad habit of VB6 programmers, lets take a look at the standard output stream cout in C++.

String concatenation in C++
cout << "Datei " << FileName << " kann nicht geöffnet werden" << endl ;

From a localization standpoint, this is just as bad as string concatenation in VB6.

Empty strings in spreadsheets

After all this background information, we can come back to the handling of empty strings during the Excel or OpenDocument spreadsheet import.

Older versions had no special treatment for empty strings. They were simply imported into the project database. This means that an empty string in the spreadsheet was interpreted as an empty translation.

Starting with version 4.53.0005, empty strings are interpreted as no translation. This means:

  • the empty string is not imported into the project database and furthermore
  • if there is a translation in the project database, then it will be removed.

This will cause the text to be determined via the resource fallback mechanism.

Why the change?

Finally, you might wonder, why I made this change.

The reason has more to do with export than with import. The older version effectively used the resource fallback logic during the export. As a result, missing translations in a given language were filled up with the original (non localized) texts.

One user suggested that it would be better to leave the cells for missing translations empty, so that you can immediately see that the translations are missing.

This seemed sensible, but on looking closer I realised that I would have to make a change to import logic, to prevent empty cells in the spreadsheet being imported as empty translations.

Previously, I think that I was more sympathetic to the idea of empty translations, and thought that empty translations had to be a legal option. As I have explained above, I am now of the opinion that empty translations are always bad practice.