MEAN web development #7: MongoDB and Mongoose

Last week we saw some of the basic functionality of AngularJS, at least enough to get you started. Before that we looked at Node.js and Express. So that’s EAN and we’re left with the M. Well, Dial M for MongoDB because that’s what we’re going to look at this week.

  1. MEAN web development #1: MEAN, the what and why
  2. MEAN web development #2: Node.js in the back
  3. MEAN web development #3: More Node.js
  4. MEAN web development #4: All aboard the Node.js Express!
  5. MEAN web development #5: Jade and Express
  6. MEAN web development #6: AngularJS in the front
  7. MEAN web development #7: MongoDB and Mongoose
  8. MEAN web development #8: Sockets will rock your socks!
  9. MEAN web development #9: Some last remarks

As usual you can find the examples for this post on my GitHub page in the mean7-blog repository.

Hello persistent data

I’ve already written an entire post on NoSQL and MongoDB, A first look at NoSQL and MongoDB in particular. I’ve already told you to read it in the first part of this series, MEAN web development #1: MEAN, the what and why. If you haven’t read either of those I suggest you do so before continuing because I won’t repeat how to install MongoDB and MongoVUE. Don’t worry I’ll wait…

Before we continue I should mention that everything we’re going to do is async. That means lots of callback functions. We don’t want to block our main thread after all! It also means that the callbacks may not be called in the same order as their ‘parent functions’, and that there is no guarantee that records have actually been inserted before we query them! Because I wanted to keep the complete example file simple I haven’t nested all examples in callbacks, so keep in mind that you may get some odd results. It worked fine for me by the way; if you do get weird results, try running the examples one by one (simply comment out the others).
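
To see why the order matters, here is a minimal sketch that needs no database at all. The fakeInsert and fakeFind helpers are made up for this illustration; they only mimic asynchronous calls with setTimeout and are not part of the MongoDB driver.

```javascript
// A stand-in for a collection: just an array in memory.
var store = [];

// Hypothetical helper: 'inserts' a document after a short delay.
function fakeInsert(doc, callback) {
    setTimeout(function () {
        store.push(doc);
        callback(null, doc);
    }, 20);
}

// Hypothetical helper: 'finds' a document by name after a shorter delay.
function fakeFind(name, callback) {
    setTimeout(function () {
        var found = store.filter(function (d) { return d.name === name; })[0];
        callback(null, found || null);
    }, 5);
}

// Not nested: the find is started before the insert has finished.
fakeInsert({ name: 'Massive Attack' }, function () {});
fakeFind('Massive Attack', function (err, artist) {
    console.log('Not nested:', artist); // null, the insert is not done yet
});

// Nested: the find only starts once the insert has called back.
fakeInsert({ name: 'The Beatles' }, function () {
    fakeFind('The Beatles', function (err, artist) {
        console.log('Nested:', artist);
    });
});
```

The nested variant is the safe one: anything that depends on the result of an async call belongs inside its callback.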

So let’s just get a Node.js server up and running and write some data to the database real quick! You’ll be surprised how easy it is. First of all install the MongoDB driver using npm (npm install mongodb).
Next we’ll make a connection to our MongoDB instance.

var app = require('express')();
var MongoClient = require('mongodb').MongoClient;
     
var urlWithCreds = 'mongodb://user:password@localhost:27017/local';
var url = 'mongodb://localhost:27017/local';
MongoClient.connect(url, function (err, db) {
    if (err) {
        console.log(err);
    } else {
        console.log('Connected to the database.');
        db.close();
    }
});
var server = app.listen(80, '127.0.0.1');

So first of all we require Express (which isn’t necessary for MongoDB) and MongoDB. We take the MongoClient property of the MongoDB module. We use this client to connect to the database using the connect function, which takes a URI and a callback function. The callback has a MongoError (in case you can’t log in, for example if you have wrong credentials) and a Db as parameters.
We can use the Db object to do all kinds of stuff like creating and dropping databases, collections and indices and do our CRUD operations (Create, Read, Update, Delete). Let’s insert a simple object.

var url = 'mongodb://localhost:27017/local';
MongoClient.connect(url, function (err, db) {
    if (err) {
        console.log(err);
    } else {
        var artist = {
            name: 'Massive Attack',
            countryCode: 'GB'
        };
        var collection = db.collection('artists');
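        collection.insertOne(artist, function (err, result) {
            if (err) {
                console.log(err);
            } else {
                // The driver has added an _id to our object.
                console.log(artist._id);
            }
            // Only close the connection once the insert has finished.
            db.close();
        });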
    }
});

As you can see we use the db parameter to get a collection (the MongoDB variant of a database table) using the collection function. If the collection does not exist it is created automatically. We can then simply insert an object using the insertOne function of the collection. Now something funny has happened. After calling insertOne our artist object suddenly has an _id property. MongoDB uses this _id to uniquely identify objects.
So that wasn’t so bad, right? Let’s look at other CRUD functionality!

CRUD with MongoDB

So let’s retrieve the record we just inserted. We can do this using the findOne function of the collection.

MongoClient.connect(url, function (err, db) {
    if (err) {
        console.log(err);
    } else {
        var collection = db.collection('artists');
        collection.findOne({ name: 'Massive Attack' }, function (err, artist) {
            if (err) {
                console.log(err);
            } else {
                console.log(artist);
            }
            db.close();
        });
    }
});

So the object that is passed to the findOne function is actually a search parameter. In this case we’re looking for documents (or records) that have a name equal to ‘Massive Attack’. The second parameter is a callback function that gives us an error, if any occurred, and the document that was retrieved. If you ran the previous example multiple times Massive Attack will be in your database more than once (with different values for _id), in which case findOne simply returns the first document it finds.

So let’s insert a few more artists, just so we’ve got a little set to work with. We can use the insertMany function for this.

collection.insertMany([
{
    name: 'The Beatles',
    countryCode: 'GB',
    members: [
        'John Lennon',
        'Paul McCartney',
        'George Harrison',
        'Ringo Starr'
    ]
},
{
    name: 'Justin Bieber',
    countryCode: 'No one wants him'
},
{
    name: 'Metallica',
    countryCode: 'USA'
},
{
    name: 'Lady Gaga',
    countryCode: 'USA'
}
], function (err, result) {
    if (err) {
        console.log(err);
    } else {
        console.log(result);
    }
});

Now you may think there’s a findMany function as well, but it’s actually just called find. find returns a Cursor, which is something like an array, but not quite. We can use its toArray method though. The find function has a query parameter, which is just an object that describes which fields of a document must have which values. We can search fields with AND, OR, NOT, IN, greater than, less than, regular expressions and everything you’re used to in SQL databases.

var findCallback = function (err, artists) {
    if (err) {
        console.log(err);
    } else {
        console.log('\n\nFound artists:');
        artists.forEach(function (a) {
            console.log(a);
        });
    }
};

// All documents.
collection.find().toArray(findCallback);

// Name not equal to Justin Bieber.
collection.find({ name: { $ne: 'Justin Bieber' } }).toArray(findCallback);

// Name equal to Massive Attack or name equal to The Beatles.
collection.find({ $or: [{ name: 'Massive Attack' }, { name: 'The Beatles' }] }).toArray(findCallback);

// Members contains John Lennon.
collection.find({ members: 'John Lennon' }).toArray(findCallback);
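
If you want to get a feel for what these query objects mean without touching the database, the following sketch evaluates them against a plain array. The matches and find functions are a simplified illustration of my own and not how MongoDB implements queries; they only cover plain values, $ne, $or and array membership.

```javascript
// Simplified, hypothetical matcher: plain values, $ne and $or only.
function matches(doc, query) {
    return Object.keys(query).every(function (key) {
        var condition = query[key];
        if (key === '$or') {
            // At least one of the sub-queries has to match.
            return condition.some(function (sub) { return matches(doc, sub); });
        }
        var value = doc[key];
        if (condition !== null && typeof condition === 'object' && '$ne' in condition) {
            return value !== condition.$ne;
        }
        // An array field matches when it contains the value.
        if (Array.isArray(value)) {
            return value.indexOf(condition) !== -1;
        }
        return value === condition;
    });
}

var artists = [
    { name: 'Massive Attack', countryCode: 'GB' },
    { name: 'The Beatles', countryCode: 'GB', members: ['John Lennon', 'Paul McCartney'] },
    { name: 'Justin Bieber' }
];

// Hypothetical in-memory counterpart of collection.find.
function find(query) {
    return artists.filter(function (a) { return matches(a, query); });
}

console.log(find({ name: { $ne: 'Justin Bieber' } }));
console.log(find({ $or: [{ name: 'Massive Attack' }, { name: 'The Beatles' }] }));
console.log(find({ members: 'John Lennon' }));
```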

Now let’s update a record.

collection.findOneAndUpdate({ name: 'Massive Attack' },
    { $set: {
        cds: [
            {
                title: 'Collected',
                year: 2006,
                label: {
                    name: 'Virgin'
                },
                comment: 'Best Of'
            },
            {
                title: 'Mezzanine',
                year: 1998,
                label: 'Virgin'
            },
            {
                title: 'No Protection: Massive Attack v Mad Professor',
                year: 1995,
                label: 'Circa Records',
                comment: 'Remixes'
            },
            {
                title: 'Protection',
                year: 1994,
                label: {
                    name: 'Circa'
                }
            }
        ]
        }
    }, function (err, result) {
    console.log('\n\nUpdated artist:');
    if (err) {
        console.log(err);
    } else {
        console.log(result);
    }
});

Here we see the findOneAndUpdate in action. Alternatively we could’ve used updateOne. And for multiple updates we can use updateMany.
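
The $set operator in the update above only touches the fields it names and leaves the rest of the document alone. A rough sketch of that behaviour in plain JavaScript, assuming top-level fields only (the applySet helper is hypothetical and ignores things like dotted field paths):

```javascript
// Hypothetical helper: mimics what $set does to a single document.
function applySet(doc, update) {
    var result = {};
    // Keep every field the document already had.
    Object.keys(doc).forEach(function (key) {
        result[key] = doc[key];
    });
    // Add or overwrite only the fields named in $set.
    Object.keys(update.$set).forEach(function (key) {
        result[key] = update.$set[key];
    });
    return result;
}

var stored = { name: 'Massive Attack', countryCode: 'GB' };
var updated = applySet(stored, {
    $set: { cds: [{ title: 'Mezzanine', year: 1998 }] }
});

// name and countryCode are still there, cds has been added.
console.log(updated);
```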

Now let’s delete a record. There’s one guy I really don’t want in my database (yes, I’ve added him so I wouldn’t feel guilty about deleting him). For this we can, of course, use findOneAndDelete.

collection.findOneAndDelete({ name: 'Justin Bieber' }, function (err, result) {
    console.log('\n\nDeleted artist:');
    if (err) {
        console.log(err);
    } else {
        console.log(result);
    }
});

Really no surprises there. Alternatively there’s deleteOne and to delete many use, you guessed it, deleteMany.

Mongoose

So MongoDB with Node.js looks really good, right? It wasn’t very hard to use. It’s really just a matter of working with JavaScript objects. And as we all know JavaScript objects are very dynamic. In the previous examples we’ve already seen that some artists have a members property defined, that after the update one artist all of a sudden had a cds property, and that some CDs have a comment while others don’t… And MongoDB has no problem with it at all. We just save and fetch what is there.

Now try this.

var app = require('express')();
var MongoClient = require('mongodb').MongoClient;

var Artist = function (name, activeFrom, activeTo) {
    if (!(this instanceof Artist)) {
       return new Artist(name, activeFrom, activeTo);
    }
    var self = this;
    self.name = name;
    self.activeFrom = activeFrom;
    self.activeTo = activeTo;
    self.yearsActive = function () {
        if (self.activeTo) {
            return self.activeTo - self.activeFrom;
        } else {
            return new Date().getFullYear() - self.activeFrom;
        }
    };
};

var url = 'mongodb://localhost:27017/local';
MongoClient.connect(url, function (err, db) {
    if (err) {
        console.log(err);
    } else {
        var collection = db.collection('artists');
        // Empty the collection
        // so the next examples can be run more than once.
        collection.deleteMany({});
        
        var massiveAttack = new Artist('Massive Attack', 1988);
        console.log('\n\n' + massiveAttack.name + ' has been active for ' + massiveAttack.yearsActive() +  ' years.');
        
        collection.insertOne(massiveAttack);
        
        collection.findOne({ name: massiveAttack.name }, function (err, result) {
            if (err) {
                console.log(err);
            } else {
                try {
                    console.log('\n\n' + result.name + ' has been active for ' + result.yearsActive() +  ' years.');
                } catch (ex) {
                    console.log(ex);
                }
            }
        });
        
    }
});

var server = app.listen(80, '127.0.0.1');

What happens is that MongoDB doesn’t store the yearsActive function nor the constructor function. What MongoDB stores are just the non-function values. The result is that when we retrieve our object it will no longer be an Artist object, but just an object that just so happens to have the same properties as an Artist.
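
You can reproduce the effect without a database. In the sketch below a JSON round trip stands in for storing and fetching. That is only an analogy, since the driver serializes to BSON rather than JSON, but the outcome is similar: the data survives, the behaviour does not.

```javascript
function Artist(name, activeFrom) {
    this.name = name;
    this.activeFrom = activeFrom;
    this.yearsActive = function () {
        return new Date().getFullYear() - this.activeFrom;
    };
}

var original = new Artist('Massive Attack', 1988);

// Stand-in for 'save to the database and fetch it again'.
var fetched = JSON.parse(JSON.stringify(original));

console.log(fetched.name);               // the data is still there
console.log(fetched instanceof Artist);  // false
console.log(typeof fetched.yearsActive); // 'undefined'

// One manual fix: copy the fetched data onto a fresh Artist.
var revived = Object.assign(new Artist(), fetched);
console.log(revived.yearsActive());
```

Doing this by hand for every type and every query gets old quickly.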

This is where Mongoose comes to the rescue! Mongoose adds a schema to your MongoDB objects. Let’s see how that works.

To add Mongoose to your project you can install it using npm install mongoose.

So first we can use mongoose.connect to get a connection to the database.

var app = require('express')();
var mongoose = require('mongoose');

var url = 'mongodb://localhost:27017/local';
mongoose.connect(url);
var db = mongoose.connection;
db.on('error', function (err) {
    console.log(err);
});
db.once('open', function (callback) {
    // ...
});

var server = app.listen(80, '127.0.0.1');

After that we can define a schema using the Schema function.

db.once('open', function (callback) {
    var artistSchema = mongoose.Schema({
        name: String,
        activeFrom: Number,
        activeTo: Number
    });
    artistSchema.methods.yearsActive = function () {
        var self = this;
        if (self.activeTo) {
            return self.activeTo - self.activeFrom;
        } else {
            return new Date().getFullYear() - self.activeFrom;
        }
    };
});

And as you can see I’ve appended the yearsActive function to the artistSchema.methods object. After that we can create a Model using mongoose.model.

var Artist = mongoose.model('Artist', artistSchema);

And after that the Artist variable (a Model) is actually the portal to your collection. It’s also a constructor function for artists. So let’s create our Massive Attack artist.

var massiveAttack = new Artist({ name: 'Massive Attack', activeFrom: 1988 });
console.log('\n\n' + massiveAttack.name + ' has been active for ' + massiveAttack.yearsActive() + ' years.');

And then we can save it using the save function.

massiveAttack.save(function (err, result) {
    if (err) {
        console.log(err);
    } else {
        // ...
    }
});

And now that the artist is saved let’s retrieve it and call that yearsActive function again. We can simply retrieve our object using Model.findOne.

Artist.findOne({ name: massiveAttack.name }, function (err, result) {
    if (err) {
        console.log(err);
    } else {
        try {
            console.log('\n\n' + result.name + ' has been active for ' + result.yearsActive() +  ' years.');
        } catch (ex) {
            console.log(ex);
        }
    }
});

And here I’ve put the findOne directly in the callback function of save, which I didn’t do before. I needed this because calling findOne directly after save didn’t yield any results: save is asynchronous, so the document had not been written yet when findOne ran. More importantly, this time the yearsActive function executed successfully!

And like with the regular MongoDB driver we can use find, remove, findOneAndRemove and findOneAndUpdate.

So we’ve looked at the MongoDB driver and at the problem of schemaless objects, which Mongoose fixes. I recommend practicing a bit (it’s really very easy to create, drop, insert, update, read and remove data) and reading the API documentation of both. We’ve only scratched the surface here, but it should get you on your way.

And of course I’m going to recommend some additional reading. The Node.js Succinctly book is just a great resource for Node.js in general and it has a tiny bit on MongoDB and SQLite as well. I can also recommend MongoDB Succinctly. And Getting MEAN from Manning even has a chapter on MongoDB and Mongoose.

Happy coding!

A first look at NoSQL and MongoDB in particular

So today I decided to have a look at NoSQL. It’s not exactly new and actually I’m a bit late to jump on the NoSQL train, but so far I had no need for it (and actually I still don’t, but I had some time to spare and a blog to write). Since NoSQL can be quite complicated, as it imposes a new way of thinking about storing data, and I can’t possibly discuss everything there is to discuss, I’ll add some additional reading at the end of the article.

An overview of NoSQL

First things first, what is NoSQL? As the name implies it’s not SQL (Structured Query Language), the standard language for databases that support the relational database model. As SQL has been the standard for some twenty to thirty years I’m not going to discuss it, you probably know it. A common misunderstanding is that NoSQL stands for “no SQL”, while it actually means “Not Only SQL”, which implies there is at least some SQL-y goodness to be had in NoSQL as well. Whatever that SQL-y goodness may be, it’s not the relational model. And this is where NoSQL is fundamentally different from SQL: expect de-normalized and duplicated data.
This ‘feature’ is what makes flexible schemas possible. In NoSQL it’s generally easy to add fields to your database. Where in a SQL database you could lock a table for minutes if it contains a fair amount of data, in NoSQL you can add fields on the fly (in production!). Querying data can also be faster than in your typical SQL database, because de-normalization reduces or even eliminates expensive joins.
A downside to this method of storing data is that it is harder to keep your data consistent. Where in SQL consistency is more or less guaranteed if you have normalized your database, NoSQL offers consistency or eventual consistency. How NoSQL databases provide this (eventual) consistency differs per vendor, but it doesn’t come as naturally as in SQL databases. Also, because of the way data is stored and queried, NoSQL databases tend to scale better across machines than SQL databases.
Other than that no uniform definition can be given for NoSQL because there is no standard. Still NoSQL can be roughly divided into four database models (some would say more, let’s not get into such details): Document, Graph, Key-value and Wide Column. So let’s get a quick overview of those and try one out!

The Document Model

First there’s the Document model. When thinking of a document don’t think of a Word or Excel document, think of an object like you would have in an object-oriented language such as Java or C#. Each document has fields containing a value such as a string, a date, another document or an array of values. The schema of a document is dynamic and as such it’s a breeze to add new fields. Documents can be queried on any field.
Because a value can be another document or an array of documents, data access is simplified and the need for joins, like you would have in a relational database, is reduced or even eliminated. It also means you will need to de-normalize and store redundant data though!
Document model databases can be used in a variety of applications. The model is flexible and documents have rich query capabilities. Additionally the document structure closely resembles objects in modern programming languages.
Some examples of Document databases are MongoDB and CouchDB.
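
To make the difference with the relational model a bit more tangible, here is a small sketch in plain JavaScript. It uses no database; the variable and function names are made up for the illustration. The first half mimics two normalized tables and a hand-written join, the second half stores the same data as a single document.

```javascript
// Relational style: two 'tables' linked by a key.
var artistsTable = [{ id: 1, name: 'Massive Attack' }];
var cdsTable = [
    { artistId: 1, title: 'Mezzanine', year: 1998 },
    { artistId: 1, title: 'Protection', year: 1994 }
];

// A hand-written join: look up the artist, then its CDs.
function cdTitlesByJoin(name) {
    var artist = artistsTable.filter(function (a) { return a.name === name; })[0];
    return cdsTable
        .filter(function (cd) { return cd.artistId === artist.id; })
        .map(function (cd) { return cd.title; });
}

// Document style: the CDs are simply part of the artist.
var artistDocument = {
    name: 'Massive Attack',
    cds: [
        { title: 'Mezzanine', year: 1998 },
        { title: 'Protection', year: 1994 }
    ]
};

console.log(cdTitlesByJoin('Massive Attack'));
console.log(artistDocument.cds.map(function (cd) { return cd.title; }));
```

Both give the same titles, but the document needs no second lookup. The price is that data shared between documents (a record label, say) has to be repeated in each of them.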

The Graph Model

Next there’s the Graph model. This model, as its name implies, stores data in graphs, with nodes, edges and properties to represent the data. A graph is a mathematical structure and I won’t go into it any further. Graph databases model data as networks of relationships between entities. Sounds difficult? I think so too. Anyway, when your application is based on various relationships, such as social networks, the graph database is the way to go.
Some examples of Graph databases are HyperGraphDB and Neo4j.
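
A tiny social network shows the idea. This is only a sketch with plain arrays and objects, nothing like the storage or query language of a real graph database, and all names in it are invented.

```javascript
// Nodes are the entities, edges are the relationships between them.
var nodes = {
    alice: { name: 'Alice' },
    bob: { name: 'Bob' },
    carol: { name: 'Carol' },
    dave: { name: 'Dave' }
};
var edges = [
    { from: 'alice', to: 'bob', type: 'FRIEND' },
    { from: 'bob', to: 'carol', type: 'FRIEND' },
    { from: 'carol', to: 'dave', type: 'FRIEND' }
];

// Friendship works in both directions.
function friendsOf(id) {
    var result = [];
    edges.forEach(function (e) {
        if (e.type !== 'FRIEND') { return; }
        if (e.from === id) { result.push(e.to); }
        if (e.to === id) { result.push(e.from); }
    });
    return result;
}

// Walk one step further: friends of friends who are not friends yet.
function friendsOfFriends(id) {
    var direct = friendsOf(id);
    var result = [];
    direct.forEach(function (friend) {
        friendsOf(friend).forEach(function (candidate) {
            var isNew = candidate !== id &&
                direct.indexOf(candidate) === -1 &&
                result.indexOf(candidate) === -1;
            if (isNew) { result.push(candidate); }
        });
    });
    return result;
}

console.log(friendsOfFriends('alice').map(function (id) { return nodes[id].name; }));
```

Questions like this one, which follow relationships for several hops, are the kind of thing graph databases are built for.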

The Key-value Model

Key-value databases are the simplest of the NoSQL databases. They basically provide a key and a value, where the value can be anything. Data can be queried by key only. Each key can have a different (type of) value. Because of this simplicity these databases tend to be highly performant and scalable. That same simplicity, however, means they don’t fit many applications.
Some examples of Key-value databases are Redis and Riak.
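
JavaScript’s built-in Map behaves a lot like a miniature key-value store, so it makes for a quick sketch of the model. The keys and values below are invented, and a real store such as Redis of course offers far more than this.

```javascript
var store = new Map();

// Every key can hold a different type of value.
store.set('session:42', { user: 'sander', loggedIn: true });
store.set('visits:home', 1031);
store.set('motd', 'Happy coding!');

// Looking up by key is all there is to it.
console.log(store.get('visits:home'));
console.log(store.get('session:42').user);

// Asking 'which key holds this value?' means scanning everything.
function keysWithValue(value) {
    var keys = [];
    store.forEach(function (v, k) {
        if (v === value) { keys.push(k); }
    });
    return keys;
}

console.log(keysWithValue('Happy coding!'));
```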

The Wide Column Model

Last is the Wide Column model. Like the Key-value model, the Wide Column model consists of a key on which data can be queried, can be highly performant and isn’t suited to every application. Each key holds a ‘single’ value that can have a variable number of columns. Each column can nest other columns. Columns can be grouped into a family and each column can be part of multiple column families. Like the Document model, the schema of a Wide Column store is flexible. Phew, and I thought the Graph model was complicated!
Some examples of Wide Column databases are Cassandra and HBase.
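
As a rough mental picture (and no more than that, this is not how Cassandra or HBase actually store data), you can think of a row key that points to column families, each holding its own variable set of columns. The row keys, family names and the getColumn helper are made up for this sketch.

```javascript
// Row key -> column family -> columns.
var rows = {
    'artist:1': {
        info: { name: 'Massive Attack', countryCode: 'GB' },
        cds: { 'Mezzanine': 1998, 'Protection': 1994 }
    },
    'artist:2': {
        // This row has fewer columns and no cds family at all.
        info: { name: 'The Beatles' }
    }
};

// Hypothetical helper: data is always reached through the row key.
function getColumn(rowKey, family, column) {
    var row = rows[rowKey];
    if (!row || !row[family]) {
        return undefined;
    }
    return row[family][column];
}

console.log(getColumn('artist:1', 'cds', 'Mezzanine'));
console.log(getColumn('artist:2', 'info', 'name'));
console.log(getColumn('artist:2', 'cds', 'Mezzanine'));
```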

Getting started with MongoDB

So anyway, there you have it. I must admit I haven’t actually used any of them, but I’m certainly planning to get into them a bit deeper. And actually, as promised, I’m going to try one out right now! I’ve picked MongoDB, one of the fastest growing databases of the moment. It’s a Document store and so has a wider applicability than the other types. You can download the free version at www.mongodb.org. There’s also a lot of documentation on there, so I recommend you look around a bit later. Installation is pretty straightforward. Just click next a few times and install. If you change any settings I won’t be held responsible if it doesn’t work or if you can’t follow the rest of this post. So go ahead, I’ll wait.
Ready? Once you have installed MongoDB you’ll need to run it. I was a bit surprised it doesn’t run as a service (like, for example, SQL Server) by default.
So how do you start MongoDB? Open up a command window (yes, really). First you need to create the data directory where MongoDB stores its files. The default is data\db, to create it type md data\db in your command window. Next you need to navigate to the folder where you’ve installed MongoDB. For me this was C:\Program Files\MongoDB 2.6 Standard\bin. Then start mongod.exe. If, like me, you’ve never had to work with a command window here’s what you need to type in your command window:

cd C:\
md data\db
cd C:\Program Files\MongoDB 2.6 Standard\bin
mongod.exe

If you still encounter problems or you’re not running Windows you can check this Install MongoDB tutorial. It also explains how to run MongoDB as a service, so recommended reading material there!

You might be wondering if MongoDB has a Management System where we can query and edit data without the need of a programming language. You can use the command window to issue JavaScript commands to your MongoDB database. To do this you’ll need to start mongo.exe through a command window. The Getting Started with MongoDB page explains this in greater detail. However I would HIGHLY RECOMMEND that you download MongoVUE instead. It’s an easy to use, graphical, management system for MongoDB. Do yourself a favour and install it before you read any further. You can check out the data we’ll be inserting and editing in the next paragraphs.

One more thing before we continue. Mongo stores its documents as BSON, which stands for Binary JSON. It’s not really relevant right now, but it’s good to know. We’ll see some classes named Bson*, now you know where that comes from. MongoVUE lets you see your stored documents in JSON format.

The C# side of MongoDB

So now that we are running MongoDB start up a new C# Console project in Visual Studio. Make sure you have saved your project (just call it MongoDBTest or something). Now open up the Package Manager Console, which can be found in the menu under Tools -> Library Package Manager -> Package Manager Console. Getting MongoDB to work in your project is as simple as entering the following command: PM> Install-Package mongocsharpdriver. The MongoDB drivers will be installed and added to your project automatically. Make sure you import the following namespaces to your file:

using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;
using MongoDB.Driver.Builders;
using MongoDB.Driver.Linq;
using System;
using System.Linq;

So are you ready to write some code? First we’ll need something we want to store in our database, let’s say a Person. I’ve created the following class to work with when we start.

public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
}

Classes don’t come easier. Notice I’ve used the ObjectId for the Id field. Using this type for an ID field makes Mongo generate an ID for you. You can use any type as an ID field, but you’ll need to set it to a unique value yourself (or you’ll overwrite the record that already has that ID). Another gotcha is that you need to call your ID field Id (case-sensitive) or annotate it with the BsonIdAttribute. And since we’re talking about Attributes, here’s another one that’ll come in handy soon, the BsonIgnoreAttribute. Properties with that Attribute won’t be persisted to the store.

public class Person
{
    [BsonId()]
    public ObjectId MyID { get; set; }
    public string Name { get; set; }
    [BsonIgnore()]
    public string NotPersisted { get; set; }
}

For now we’ll work with the default Id field. So now let’s make a connection to our instance and create a database. This is actually rather easy as you’ll see. Mongo creates a database automatically whenever you put some data in it. Once we have a connection to our database we’ll want to put some data in it. More specifically, we want to create a Person and store it. To do this we’ll first ask for a collection of Persons with a specific name (a table name, if you like). You can store multiple collections of Persons if you use different names for the collections, so beware of typos! Once we have a collection from the database we’ll create a Person and save it to the database. That’s a lot of stuff all at once, but actually the code is so simple you’ll get it anyway!

// Connect to the database.
string connectionString = "mongodb://localhost";
MongoClient client = new MongoClient(connectionString);
MongoServer server = client.GetServer();
MongoDatabase database = server.GetDatabase("testdb");

// Store a person.
MongoCollection<Person> persons = database.GetCollection<Person>("person");
Person p1 = new Person() { Name = "Sander" };
persons.Save(p1);
Console.WriteLine(p1.Id.ToString());
Console.ReadKey();

Wow, that was pretty easy, wasn’t it!? Mongo generated an ID for you, as you can see. Next we’re going to get this Person back from our database. There are a few ways to do this. We can work using the Mongo API or we can use LINQ. Both offer multiple methods of querying for one or multiple records. I suggest you read the documentation and experiment a bit. Here are a couple of ways to get our Person back from the database.

// Using the MongoDB API.
ObjectId id = p1.Id;
Person sanderById = persons.FindOneById(id);
Person sanderByName = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));

// Using LINQ.
var sandersByLinq = from p in persons.AsQueryable()
                    where p.Name == "Sander"
                    select p;
Person sander = sandersByLinq.SingleOrDefault();

You’ll notice the Query.EQ. EQ stands for equal and builds a query that tests if a field is equal to a specific value. There are other query types like GT (Greater Than), LT (Less Than), In, Exists etc.

But wait, I’m not happy with this code at all! What Person really needs are LastName and Age fields. Now here comes this flexible schema I’ve been telling you about. Simply add the properties to your class. If you fetch a Person that doesn’t have these fields specified they’ll be set to a default value. In the case of Age you might want to use an int? rather than an int, or your already existing Persons will have an age of 0 rather than null.

Person incompleteSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
Console.WriteLine(String.Format("{0}'s last name is {1} and {0}'s age is {2}",
    incompleteSander.Name, incompleteSander.LastName, incompleteSander.Age.ToString()));

incompleteSander.LastName = "Rossel";
incompleteSander.Age = 27;

// Let's save those new values.
persons.Save(incompleteSander);

Console.ReadKey();
// Retrieve the person again, but this time with last name and age.
Person completeSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
Console.WriteLine(String.Format("{0}'s last name is {1} and {0}'s age is {2}",
    completeSander.Name, completeSander.LastName, completeSander.Age.ToString()));

Console.ReadKey();

Now let’s also add an address to Person. Address will be a new class and Person will hold a reference to an Address. Now you can just model this like you always would.

public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
    public string LastName { get; set; }
    public int? Age { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string AddressLine { get; set; }
    public string PostalCode { get; set; }
}

Notice that Address doesn’t need an Id field? That’s because it’s a sub-document of Person: it doesn’t exist without a Person and as such doesn’t need an Id to make it unique. Now fetch your already existing Person from the database, check that its address is empty, create an address, save it and fetch it again.

Person addresslessSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
if (addresslessSander.Address != null)
{
    Console.WriteLine(String.Format("Sander lives at {0} on postal code {1}", addresslessSander.Address.AddressLine, addresslessSander.Address.PostalCode));
}
else
{
    Console.WriteLine("Sander lives nowhere...");
}

addresslessSander.Address = new Address() { AddressLine = "Somewhere", PostalCode = "1234 AB" };
persons.Save(addresslessSander);

Person addressSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
if (addressSander.Address != null)
{
    Console.WriteLine(String.Format("Sander lives at {0} on postal code {1}", addressSander.Address.AddressLine, addressSander.Address.PostalCode));
}
else
{
    Console.WriteLine("Sander lives nowhere...");
}

Console.ReadKey();

Make sure you check out the JSON in MongoVUE. Also try experimenting with Lists of classes. Try adding more Addresses, for example. We haven’t deleted or updated any records either, we’ve only overwritten entire entries. Experiment and read the documentation.

We’ve now scratched the surface of NoSQL and MongoDB in particular. Of course MongoDB has a lot more to offer, but I hope this post has helped you get your feet wet in NoSQL and MongoDB. Perhaps it has given you that little push you needed to get started. It has for me. Expect more NoSQL blogs in the future!

Additional reading

As promised, here’s some additional reading:
NoSQL – Wikipedia
MongoDB White Papers
Document Databases : A look at them
How to take advantage of Redis just adding it to your stack

Comments are welcome. Happy coding!

Using C# to connect to and query from a SQL database

As a developer you’ll probably spend a lot of time getting data in and out of a database. Data is important in any organization and your job as a developer is to present that data to a user, have them add or edit that data and store it back to the database.

Yet I have found that many developers really have no clue how to work with a database! Many developers can get data out of databases, but do so in an unsafe way that may break your code and, worse, give hackers an opportunity to get direct access to your database! Others use an ORM like NHibernate, Entity Framework or LINQ To SQL, but have no idea what’s going on. In this blog post I will address these issues: how to set up a database connection, query for data in a secure manner and use that data in your code. I’ll also show you how to push data back to a database.

I am assuming you know how to set up a database and you know your way around C# and the .NET Framework. For my example I have used the Adventure Works 2014 Sample Database on a SQL Server 2014 database.

Creating a Connection

So let’s start. To create a connection to a database you’ll first need a database connection object. In our case we need a specific type of connection object, being the SqlConnection. Using the SqlConnection you can configure all kinds of settings that are used for your current session to the database. In this blog we’ll use defaults only. For creating the SqlConnection we’ll use the constructor that takes a connection string as input parameter. Usually you’d get the connection string from a config file or some such. Alternatively you can create one using the SqlConnectionStringBuilder, but I won’t go into that here. Notice that I’ve wrapped the SqlConnection in a using block. This ensures that the connection is actually closed once we’re done with it. Make sure you actually open the connection only when needed.

using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
{
    connection.Open();
}

Creating a Command

Unfortunately this doesn’t do anything yet. We’ll need a SqlCommand which takes the query we want to send to the database. In this case I’m going to select all persons from the table Person.Person. We can create a command object in different ways, but I’m going to create one using the constructor that takes the query and our just created connection. Once we have created the command we must open the connection (which we already did) and have it execute our query. There are a few ways to have the command actually execute your query.

Executing a Command

The first is ExecuteNonQuery, which seems odd because we are going to execute a query, right? Well, actually you use this method when you don’t expect a result set (perhaps an update statement, or a call to a Stored Procedure that returns no result set).

The second method, and the one we’ll need in this example, is ExecuteReader. This method returns a SqlDataReader which represents a forward-only stream of rows from the database. The columns of each row can be accessed by index or name. We’ll see how to use the SqlDataReader in the next example.

The third method, and the last I will discuss, is ExecuteScalar. You can use this method when you expect exactly one value from a query: it returns the first column of the first row and ignores everything else.
There’s also an ExecuteXmlReader method which I will not discuss here. Additionally, most of these methods have async versions. For older versions of .NET these are the Begin and End method pairs (BeginExecuteReader and EndExecuteReader, for example) and from .NET 4.5 onwards these are the Async methods (ExecuteReaderAsync, for example). I will not discuss them here.

So let’s look at our example. We’re going to create a command to fetch some data from the Person.Person table and use ExecuteReader to get our results.

List<Person> persons = new List<Person>();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
{
    connection.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
            // Check if the reader has any rows at all before starting to read.
        if (reader.HasRows)
        {
            // Read advances to the next row.
            while (reader.Read())
            {
                Person p = new Person();
                // To avoid unexpected bugs access columns by name.
                p.ID = reader.GetInt32(reader.GetOrdinal("ID"));
                p.FirstName = reader.GetString(reader.GetOrdinal("FirstName"));
                int middleNameIndex = reader.GetOrdinal("MiddleName");
                // If a column is nullable always check for DBNull...
                if (!reader.IsDBNull(middleNameIndex))
                {
                    p.MiddleName = reader.GetString(middleNameIndex);
                }
                p.LastName = reader.GetString(reader.GetOrdinal("LastName"));
                persons.Add(p);
            }
        }
    }
}
// Use persons here...

You may have noticed that getting a value from a SqlDataReader isn’t easy! There are methods like GetString, GetInt32, GetBoolean, etc. to convert values from their database representation to their CLR type equivalents. Unfortunately they throw on DBNull values. So in case of MiddleName, which is a nullable column in the database, we need to check for DBNull before setting the MiddleName value. For a nullable integer or boolean column (or any other value type) the property on our class would have to be the nullable equivalent of that type, like int? or bool? (which is short for Nullable<int> and Nullable<bool>).
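The DBNull check can be tucked away in a pair of small extension methods. What follows is only a sketch: the class DataRecordExtensions and its methods GetNullableString and GetNullableInt32 are names made up for this example, not part of ADO.NET. They are written against IDataRecord, an interface that SqlDataReader implements.

```csharp
using System;
using System.Data;

// Hypothetical helper class; the names are illustrative and not part of ADO.NET.
public static class DataRecordExtensions
{
    // Returns null instead of throwing when the column holds DBNull.
    public static string GetNullableString(this IDataRecord record, string columnName)
    {
        int ordinal = record.GetOrdinal(columnName);
        return record.IsDBNull(ordinal) ? null : record.GetString(ordinal);
    }

    // Same idea for a value type: DBNull becomes an empty int?.
    public static int? GetNullableInt32(this IDataRecord record, string columnName)
    {
        int ordinal = record.GetOrdinal(columnName);
        return record.IsDBNull(ordinal) ? (int?)null : record.GetInt32(ordinal);
    }
}
```

With such a helper in place the MiddleName lines in the reader loop would shrink to p.MiddleName = reader.GetNullableString("MiddleName");.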

Using an Adapter

Another method to get data from the database is by using a SqlDataAdapter. This results in a DataTable or DataSet (for multiple result sets) containing the database data. I won’t go into the use of DataTables and DataSets, but they are like in-memory GridViews. They even track if a row was changed, and the adapter can automatically generate update, insert or delete commands when used with a SqlCommandBuilder.
The next code snippet shows how to fill a DataTable (that’s a lot less code than the SqlDataReader example, but keep in mind that the result is also very different). Notice that there is no call to connection.Open this time: Fill opens the connection if it is closed and closes it again when it is done.

DataTable table = new DataTable();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(table);
}
// Use table here...
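To give an idea of what "use table here" might involve, here is a small sketch that does not need a database. The class DataTableDemo and its methods are invented for this example, and CreateSampleTable only imitates a filled table: the first row is the person mentioned elsewhere in this post, the second row is made-up sample data. The call to AcceptChanges mimics what Fill does by default, so that rows start out as Unchanged.

```csharp
using System;
using System.Data;

// Hypothetical demo class; nothing in here talks to a real database.
public static class DataTableDemo
{
    // Builds a stand-in for the table that adapter.Fill would produce.
    public static DataTable CreateSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("MiddleName", typeof(string));
        table.Columns.Add("LastName", typeof(string));
        table.Rows.Add(1, "Ken", "J", "Sánchez");
        table.Rows.Add(2, "Jane", DBNull.Value, "Doe"); // invented sample row
        table.AcceptChanges(); // rows now have RowState Unchanged
        return table;
    }

    // Values come back as object, and NULLs come back as DBNull.
    public static string FormatName(DataRow row)
    {
        string middle = row.IsNull("MiddleName") ? "" : " " + (string)row["MiddleName"];
        return (string)row["FirstName"] + middle + " " + (string)row["LastName"];
    }
}
```

Looping over table.Rows and calling FormatName on each row gives you the full names. Assigning a new value to one of the columns of a row flips that row's RowState from Unchanged to Modified, which is the change tracking mentioned above.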

SQL Injection

For the next example we are going to select a subset of persons by first name. That means we’ll have to change our query. Let’s look at an example.

string firstName = "John";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = '" + firstName + "'", connection))
{
    // ...
}

Looking good, right? NO! THIS IS REALLY VERY WRONG! For John this works great (I’ll tell you in a moment why it works, but still isn’t great), but for D’Artagnan (a musketeer) this won’t work at all! While the apostrophe is all good in C#, it ends a string in SQL. So the query you’ll be sending to SQL Server is SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = 'D'Artagnan'. Go to SQL Server Management Studio, open a new query window and try to run that exact query. You’ll get an error message saying something about an unclosed quotation mark. What it should’ve been was 'D''Artagnan', with the apostrophe doubled. But even replacing every apostrophe with two apostrophes isn’t a real solution.

Whenever you send a query to SQL Server a query plan is made and the fastest way to get your data is calculated. For our query SQL Server might decide it will use an index we placed on FirstName. Once the plan is decided it’s cached and re-used when the exact same query is called. In our example that would mean a plan is made and cached for each name we look for! That’s not very efficient since every plan will probably be the same anyway…

You’ve been HACKED!

What’s even worse, and THIS IS VERY IMPORTANT, is that concatenating strings to form a query like that is a HUGE SECURITY RISK! Maybe you’ve heard of SQL Injection Attacks. Let me demonstrate this. Let’s assume for a moment that the user gets a textbox to enter a name and that name is concatenated to your query like above. Now the user enters John'; USE master; DROP DATABASE AdventureWorks2014 -- and BAM! There goes your database… Really, it’s gone. I hope you have a backup. The same technique is also used to steal personal information of users, like email addresses and passwords.
Here is a mandatory xkcd on the subject:

xkcd: Exploits of a Mom
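To see what the database actually receives, it helps to print the concatenated query text. The sketch below only builds strings and never touches a database. UnsafeQueryDemo and BuildUnsafeQuery are names made up for this illustration, and the method is deliberately unsafe.

```csharp
using System;

// Hypothetical demo class; it shows the problem, it is NOT a pattern to copy.
public static class UnsafeQueryDemo
{
    // Deliberately UNSAFE: mirrors the string concatenation from the example above.
    public static string BuildUnsafeQuery(string firstName)
    {
        return "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName "
             + "FROM Person.Person WHERE FirstName = '" + firstName + "'";
    }

    public static void PrintExamples()
    {
        // Ends with: WHERE FirstName = 'John'
        Console.WriteLine(BuildUnsafeQuery("John"));
        // Ends with: WHERE FirstName = 'D'Artagnan'  (broken string literal)
        Console.WriteLine(BuildUnsafeQuery("D'Artagnan"));
        // Ends with: WHERE FirstName = 'John'; USE master; DROP DATABASE AdventureWorks2014 --'
        Console.WriteLine(BuildUnsafeQuery("John'; USE master; DROP DATABASE AdventureWorks2014 --"));
    }
}
```

In the last case the user's apostrophe closes the string literal, the semicolons start new statements and the trailing -- turns the leftover apostrophe into a comment. The input is no longer data, it has become part of the SQL.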

Parameterization

So how are we going to solve these problems? Parameterization! By creating parameterized queries the query plan can be re-used for different values and SQL injection belongs to the past! So how does this look?

string firstName = "John";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = @FirstName", connection))
{
    cmd.Parameters.AddWithValue("FirstName", firstName);
    connection.Open();
    using (var reader = cmd.ExecuteReader())
    {
        // ...
    }
}

And that’s how easy it is! Notice that by adding a parameter we also improved the readability of our code. Wow, that’s a win-win-win situation!

There is one caveat though: when you want to pass a NULL to the database you’ll have to use the DBNull.Value object instead of simply null. So when fetching data we converted DBNull to null and now we’ll have to convert null to DBNull. We’ll see this happening in the next example.

Now what if we want to update, insert or delete a record in the database? We can go about it in much the same way, but use ExecuteNonQuery (which returns the number of affected rows only).

int businessEntityID = 1;
string firstName = "Sander";
string middleName = null;
string lastName = "Rossel";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("UPDATE Person.Person SET FirstName = @FirstName, MiddleName = @MiddleName, LastName = @LastName WHERE BusinessEntityID = @BusinessEntityID", connection))
{
    cmd.Parameters.AddWithValue("FirstName", firstName);
    if (middleName == null)
    {
        cmd.Parameters.AddWithValue("MiddleName", DBNull.Value);
    }
    else
    {
        cmd.Parameters.AddWithValue("MiddleName", middleName);
    }
    cmd.Parameters.AddWithValue("LastName", lastName);
    cmd.Parameters.AddWithValue("BusinessEntityID", businessEntityID);
    connection.Open();
    cmd.ExecuteNonQuery();
}

I have to add that it’s generally a good idea to check for null for ALL your parameters. You can make a helper function to prevent your code from cluttering up too much.
And in case you want your original Person back, here are his first-, middle- and last name: Ken J Sánchez.
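Such a helper function could be as small as the following sketch. ParameterHelper and ToDbValue are names I made up for this example, they are not part of the .NET Framework.

```csharp
using System;

// Hypothetical helper; the name is illustrative and not part of ADO.NET.
public static class ParameterHelper
{
    // Converts a C# null into DBNull.Value and leaves every other value alone.
    public static object ToDbValue(object value)
    {
        return value ?? DBNull.Value;
    }
}
```

The if/else around MiddleName in the update example would then collapse into a single line: cmd.Parameters.AddWithValue("MiddleName", ParameterHelper.ToDbValue(middleName));. Because a nullable value type without a value is boxed as null, the same helper also works for int? and friends.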

Stored Procedures

So far we have only worked with plain text queries. Many times you’ll want to execute a stored procedure. This works in much the same way as sending your query to the database. You simply have to set the CommandType of your command to StoredProcedure and pass in the parameters.

int businessEntityID = 1;
string nationalIDNumber = "295847284";
DateTime birthDate = new DateTime(1987, 11, 8);
char maritalStatus = 'S';
char gender = 'M';
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("HumanResources.uspUpdateEmployeePersonalInfo", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("BusinessEntityID", businessEntityID);
    cmd.Parameters.AddWithValue("NationalIDNumber", nationalIDNumber);
    cmd.Parameters.AddWithValue("BirthDate", birthDate);
    cmd.Parameters.AddWithValue("MaritalStatus", maritalStatus);
    cmd.Parameters.AddWithValue("Gender", gender);
    connection.Open();
    cmd.ExecuteNonQuery();
}

In case you want your original employee back, here is his original birthdate: 1969-01-29.

Other Databases

Perhaps you have noticed that the SqlConnection inherits from DbConnection which implements IDbConnection. We have also used other classes like the SqlCommand and SqlDataReader which inherit from DbCommand and DbDataReader in the same manner. The only thing you need to know right now is that many database providers derive their classes from these common base classes, which means that if you know how to connect to SQL Server you (more or less) know how to connect to most SQL databases like Oracle, MySQL, PostgreSQL, Firebird, etc. In theory (and probably in practice too, although I’ve never tried) you can create a flexible data layer that can switch seamlessly between (SQL) databases because of these common base classes and interfaces.
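As a rough sketch of what such a data layer could look like, the code below is written purely against the interfaces and never mentions SqlConnection. PersonNames, ReadFirstNames and GetFirstNames are hypothetical names for this example. I am also assuming that the query text is valid for whatever database sits behind the connection, since SQL dialects and parameter syntax can differ between providers.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical provider-independent data access; names are illustrative only.
public static class PersonNames
{
    // Works with any reader, be it a SqlDataReader or one from another provider.
    public static List<string> ReadFirstNames(IDataReader reader)
    {
        var names = new List<string>();
        int ordinal = reader.GetOrdinal("FirstName");
        while (reader.Read())
        {
            names.Add(reader.GetString(ordinal));
        }
        return names;
    }

    // The caller decides which provider to use by passing in the connection.
    public static List<string> GetFirstNames(IDbConnection connection)
    {
        using (IDbCommand cmd = connection.CreateCommand())
        {
            cmd.CommandText = "SELECT FirstName FROM Person.Person";
            if (connection.State != ConnectionState.Open)
            {
                connection.Open();
            }
            using (IDataReader reader = cmd.ExecuteReader())
            {
                return ReadFirstNames(reader);
            }
        }
    }
}
```

Passing in the SqlConnection from the earlier examples should work as before, and a connection object from another provider can be passed in without changing this code.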

Wrap up

Well, there you have it. We have successfully and correctly selected data, updated data and executed a stored procedure using C#. I assume you can now guess how to use ExecuteScalar, which I mentioned, but haven’t discussed further. Things don’t stop here though. There’s much more, like queries that return multiple result sets, stored procedures that return output parameters, BLOBs, bulk operations, transactions… Way too much to discuss here. Luckily there are many books, articles and blogs on the subject.
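For those who would rather not guess, here is a minimal sketch around ExecuteScalar. The thing to watch out for is the return value: it is an object that is null when the query returns no rows at all and DBNull.Value when the single value is a database NULL. ScalarHelper and its methods are hypothetical names, and the sketch assumes you are after an integer, for example the result of SELECT COUNT(*) FROM Person.Person.

```csharp
using System;
using System.Data;

// Hypothetical helper; the names are illustrative and not part of ADO.NET.
public static class ScalarHelper
{
    // Turns the raw ExecuteScalar result into an int?, treating
    // "no rows" (null) and "database NULL" (DBNull) the same way.
    public static int? ToNullableInt32(object scalarResult)
    {
        if (scalarResult == null || scalarResult is DBNull)
        {
            return null;
        }
        return Convert.ToInt32(scalarResult);
    }

    // Expects a command with an open connection, such as a SqlCommand.
    public static int? ExecuteScalarInt32(IDbCommand command)
    {
        return ToNullableInt32(command.ExecuteScalar());
    }
}
```

You would create the SqlConnection and SqlCommand exactly like in the other examples, open the connection and then call ScalarHelper.ExecuteScalarInt32(cmd) instead of cmd.ExecuteReader().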

Happy coding!