C Strings

March 1st, 2010

First and most important of all regarding C strings is this: there is no such thing as C strings.

OK, now that that’s out of the way, we can move on. The way to work with strings in C is to treat them as simple sequences of characters: a string in C is stored as an array of chars. To know where the string ends, it must be terminated by a special marker, the character with the code 0 (written as '\0'). This is why they’re called zero-terminated C strings (we’ll see later that there are alternatives to this). Let’s see an example:

char myString[100];

myString[0] = 'h';
myString[1] = 'e';
myString[2] = 'l';
myString[3] = 'l';
myString[4] = 'o';
myString[5] = '\0'; /* don't forget the marker */
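
As a rough sketch of the same idea: a string literal fills in both the characters and the marker for us, and finding the length is just a walk up to the marker. The names `count_chars` and `greeting` are made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters before the '\0' marker. */
static size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

/* Same contents as the element-by-element version above:
   the compiler stores 'h','e','l','l','o' followed by '\0'. */
static char greeting[100] = "hello";
```

Calling `count_chars(greeting)` gives 5: the marker occupies a slot in the array but is not counted as part of the string.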

If you want to do the memory allocation stuff by hand, you would just change the declaration for myString to this:

char *myString = NULL;
myString = (char *)malloc(100 * sizeof(char));
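
A minimal sketch of what doing it by hand usually involves: check that the allocation worked, leave room for the marker, and free the memory when done. The helper name `make_hello` is an assumption for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of "hello",
   or NULL if the allocation fails. The caller must free() the result. */
static char *make_hello(void)
{
    const char text[] = "hello";
    char *copy = malloc(sizeof text);   /* sizeof text counts the '\0' too */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, sizeof text);    /* copies the characters and the marker */
    return copy;
}
```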

Just creating a string and putting some characters inside isn’t such a big deal. For this to be useful, you have to be able to do all sorts of things with strings: copying them around, splitting them up (getting sub-strings from a bigger string), putting them back together (concatenating two strings into a single one), and searching for things inside them. Fortunately, the creators of the C standard library thought of us and provided functions that do all of these things and more. All you have to do to use them is:

#include <string.h>

We’ll look in detail at some of these functions and, in the spirit of learning by doing, we’ll also try to provide our own implementations for them.
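
As a small taste, here is a sketch that uses a few of the standard functions (`strlen`, `strcpy`, `strcat`, `memcpy`) for copying, concatenating and taking a sub-string. The helpers `join_words` and `take_substring` are hypothetical names invented for this example, not part of the library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "a b" into dst.
   Returns 0 on success, -1 if the result (plus marker) won't fit in cap bytes. */
static int join_words(char *dst, size_t cap, const char *a, const char *b)
{
    if (strlen(a) + 1 + strlen(b) + 1 > cap) {
        return -1;
    }
    strcpy(dst, a);     /* copy */
    strcat(dst, " ");   /* concatenate */
    strcat(dst, b);
    return 0;
}

/* Hypothetical helper: copies len characters starting at src[start] into dst.
   Assumes the range lies inside src and dst has room for len + 1 bytes. */
static void take_substring(char *dst, const char *src, size_t start, size_t len)
{
    memcpy(dst, src + start, len);
    dst[len] = '\0';    /* memcpy does not add the marker for us */
}
```

Searching is covered by `strstr`, which returns a pointer to the first match inside the string (or NULL if there is none).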

Newline

February 23rd, 2010

As with so many other things, the newline (or line break, or end-of-line, or EOL, or whatever you call it) is something we couldn’t agree on from the beginning, so we ended up with a lot of different flavors of the same thing.

The idea is simple: the newline character or group of characters says that the very next character after it should appear on a new line, immediately following the current line. The problem is that the character(s) that represent a newline vary widely across operating systems and even across different versions of the same system.

The most common forms use one or two characters to encode a newline, and among these the best known are the ASCII ones (different ASCII-based systems use different versions).
These ASCII flavors use one or both of these two characters:

  • CR (carriage return, 0x0D, usually expressed as '\r' in programming languages)
  • LF (line feed, 0x0A, usually expressed as '\n' in programming languages)

Examples of systems that use these are:

  • CR – older versions of Mac OS
  • LF – Unix, GNU/Linux, FreeBSD, Mac OS X
  • CR+LF – Windows
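
To make the three conventions concrete, here is a rough sketch that looks at raw bytes and reports which kind of line break shows up first. The `EolStyle` enum and `detect_eol` function are names invented for this example; it compares against the byte values 0x0D and 0x0A directly and assumes the whole data sits in one buffer.

```c
#include <stddef.h>

typedef enum { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF } EolStyle;

/* Hypothetical helper: reports the style of the first line break in buf.
   A CR that is the very last byte is reported as EOL_CR, since no LF follows it
   inside this buffer. */
static EolStyle detect_eol(const unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i) {
        if (buf[i] == 0x0A) {
            return EOL_LF;
        }
        if (buf[i] == 0x0D) {
            if (i + 1 < len && buf[i + 1] == 0x0A) {
                return EOL_CRLF;
            }
            return EOL_CR;
        }
    }
    return EOL_NONE;
}
```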

If you’re using Unicode, there are also Unicode versions of these:

  • CR – U+000D
  • LF – U+000A
  • CR+LF – U+000D U+000A

OK, so why should we care about all these different notations for the same thing? If we’re developing for a single platform, we probably don’t need to care much. But as the Internet turns into one big computer, situations where you develop for one system and can be absolutely sure you will never interact with any other are becoming more and more rare.

So why don’t we care if we’re developing for a single platform? Because the good people who worked on the C standard thought of this. C provides two escape sequences that represent the two codes from above. These are '\n' (newline) and '\r' (carriage return). What is probably unexpected about these is that they’re not required to have the ASCII values. The only things required by the standard are:

  • each of these has a unique value that fits inside a char, but the actual value is implementation defined;
  • when writing to a text file (a stream opened in text mode), the newline character ('\n') is transparently translated to the system’s character (or character group) for newline.

What this last point means is that if you take the same piece of code that writes to a text file, separating lines by '\n', and compile and run it on Windows and Linux for example, the two output files will be different. On Windows you will get CR+LF and on Linux just LF separating the lines.
This implies that if you’re not careful when reading such files, and you write code that depends on the actual character values of the newline, you will run into trouble when moving files from one system to another.
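
One common way to stay out of trouble is to strip whatever line break is there instead of expecting a particular one. The sketch below assumes an ASCII-based implementation (so '\r' is 0x0D and '\n' is 0x0A) and a line that was read without any translation, for example a Windows file opened on Linux. The name `strip_eol` is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes a trailing LF if present, then a trailing CR
   if one is left, so "abc\n", "abc\r" and "abc\r\n" all become "abc".
   Works in place and returns the new length. */
static size_t strip_eol(char *line)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        line[--len] = '\0';
    }
    if (len > 0 && line[len - 1] == '\r') {
        line[--len] = '\0';
    }
    return len;
}
```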

Categories: Programming

DECIMAL columns with ODBC

September 16th, 2009

I had to debug a piece of code recently that in short tried to run a query against a database and get the results through ODBC. The problem was that the DECIMAL columns were reported as being NULL, even though they actually had valid values. The procedure was simple and classic:

  • bind the column
  • set the properties (precision, scale)
  • fetch the data

In code it looked something like:

SQLBindCol( statement, columnIndex, SQL_C_NUMERIC, buffer, bufferLen, indicator );

// set the attributes on the application row descriptor (ARD), which holds the column bindings
SQLHDESC desc = NULL;
SQLGetStmtAttr( statement, SQL_ATTR_APP_ROW_DESC, &desc, 0, NULL );
SQLSetDescField( desc, columnIndex, SQL_DESC_TYPE, (SQLPOINTER)SQL_C_NUMERIC, 0 );
SQLSetDescField( desc, columnIndex, SQL_DESC_PRECISION, (SQLPOINTER)(SQLLEN)precision, 0 );
SQLSetDescField( desc, columnIndex, SQL_DESC_SCALE, (SQLPOINTER)(SQLLEN)scale, 0 );
SQLSetDescField( desc, columnIndex, SQL_DESC_DATA_PTR, buffer, 0 );

(Error checking and other stuff removed, all variables have proper types and values)

What this led to was that, after the fetch, the buffer had the right value in it but the indicator parameter was -1, saying the column is NULL.

After about half a day of banging my head against all the walls I could find trying to figure out what was wrong with this, it finally hit me: there was one more thing I had to set (actually re-set – it was already done in the call to SQLBindCol(), or so I thought). Yep, the indicator field needs to be set again:

SQLSetDescField( desc, columnIndex, SQL_DESC_INDICATOR_PTR, indicator, 0 );
SQLSetDescField( desc, columnIndex, SQL_DESC_OCTET_LENGTH_PTR, indicator, 0 );

Once this was added, the indicator started to get the right value after the fetch – problem fixed.
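
For completeness, here is a rough sketch of what reading the fetched value can look like once the indicator is trustworthy. So that it compiles without the ODBC headers, it uses stand-ins that I made up: `NumericMirror` copies the field layout of `SQL_NUMERIC_STRUCT`, `MIRROR_MAX_NUMERIC_LEN` has the value of `SQL_MAX_NUMERIC_LEN` (16), `MIRROR_NULL_DATA` has the value of `SQL_NULL_DATA` (-1), and a plain `long` stands in for `SQLLEN`. In real code you would use the ODBC types instead.

```c
#define MIRROR_MAX_NUMERIC_LEN 16   /* stand-in for SQL_MAX_NUMERIC_LEN */
#define MIRROR_NULL_DATA (-1)       /* stand-in for SQL_NULL_DATA */

/* Stand-in for SQL_NUMERIC_STRUCT (same field order and meaning). */
typedef struct {
    unsigned char precision;
    signed char   scale;
    unsigned char sign;                          /* 1 = positive, 0 = negative */
    unsigned char val[MIRROR_MAX_NUMERIC_LEN];   /* scaled integer, little-endian */
} NumericMirror;

/* Hypothetical helper: converts the scaled integer to a double.
   A double cannot hold every 38-digit value exactly, so this is approximate. */
static double numeric_to_double(const NumericMirror *n)
{
    double value = 0.0;
    int i;
    for (i = MIRROR_MAX_NUMERIC_LEN - 1; i >= 0; --i) {
        value = value * 256.0 + n->val[i];   /* most significant byte first */
    }
    for (i = 0; i < n->scale; ++i) {
        value /= 10.0;                       /* positive scale: digits after the point */
    }
    for (i = 0; i > n->scale; --i) {
        value *= 10.0;                       /* negative scale: trailing zeros */
    }
    return n->sign ? value : -value;
}

/* Hypothetical helper: returns 0 if the indicator says the column is NULL,
   otherwise stores the converted value in *out and returns 1. */
static int read_decimal(const NumericMirror *n, long indicator, double *out)
{
    if (indicator == MIRROR_NULL_DATA) {
        return 0;
    }
    *out = numeric_to_double(n);
    return 1;
}
```

With the bug described above, the first branch of `read_decimal` would be taken for every row, even though the buffer already held a perfectly good value.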

Maybe I’m an idiot who doesn’t know shit, so I’m asking: should I have seen or inferred this from the documentation? Is there a hint somewhere in MSDN that I completely missed?

Categories: Programming