Azure Functions and Serverless Computing

The post “Azure Functions and Serverless Computing” appeared first on MSDN Azure Development Community.

In my previous blog post, WebJobs in Azure with .NET Core 2.1, I briefly mentioned Azure Functions. Azure Functions are usually small (or somewhat larger) bits of code that run in Azure and are triggered by some event. Azure takes complete care of the entire infrastructure of your Functions, making it a so-called serverless solution. The only thing you need to worry about is your code.

The code samples for this post can be found on my GitHub profile. You shouldn’t really need it because I’m only using default templates, but I’ve included them because I needed them in source control for CI/CD anyway.

Serverless computing

Before we get into Azure Functions I’d like to explain a bit about serverless computing. Serverless computing isn’t completely serverless; your code still needs somewhere to run, after all. However, the servers are completely managed by your cloud provider, which is Azure in our case.

With most App Services, like a Web App or an API App, you need an App Service Plan, which is basically a server that has x CPU cores and y memory. While you can run your Functions on an App Service Plan it’s far more interesting to run them in a so-called “Consumption Plan”. With a consumption plan, your resources are completely dynamic, meaning you’re only actually using a server when your code is running.

It’s cheap

Let me repeat that, you’re only actually using a server when your code is running. This may seem trivial, but has some huge implications! With an App Service Plan, you always have a server even when no code is running. This means you pay a monthly fee just to keep the server up and running. See where this is heading? That’s right, with a consumption plan you pay only when your code is running because that’s the only time you’re actually using resources!

There’s a pretty complicated formula using running time, memory usage and executions, but believe me, Functions are cheap. Your first million executions and 400,000 GB-s (gigabyte-seconds, meaning the memory your Function uses multiplied by the time it runs) are free each month. Microsoft has a pricing example which just verifies that it’s cheap.
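To make that formula a bit less mysterious, here’s a rough sketch of how a monthly bill could be estimated. The rates in the constants ($0.20 per million executions and $0.000016 per GB-s) are assumptions based on the list prices at the time of writing, and the class and method names are made up for this example. Real billing also rounds memory and execution time up, which this sketch ignores, so check the pricing page before relying on the numbers.

```csharp
using System;

public static class ConsumptionCostSketch
{
    // Assumed rates and monthly free grants; verify them on the pricing page.
    private const decimal PricePerMillionExecutions = 0.20m;
    private const decimal PricePerGbSecond = 0.000016m;
    private const decimal FreeExecutions = 1000000m;
    private const decimal FreeGbSeconds = 400000m;

    // Estimates the monthly cost in dollars for a single Function.
    public static decimal EstimateMonthlyCost(long executions, decimal secondsPerExecution, decimal memoryGb)
    {
        // Resource consumption is memory multiplied by running time.
        decimal gbSeconds = executions * secondsPerExecution * memoryGb;

        // Only the part above the free grant is billed.
        decimal billableGbSeconds = Math.Max(0m, gbSeconds - FreeGbSeconds);
        decimal billableExecutions = Math.Max(0m, executions - FreeExecutions);

        return billableGbSeconds * PricePerGbSecond
             + billableExecutions / 1000000m * PricePerMillionExecutions;
    }

    public static void Main()
    {
        // Three million executions of one second each, using 512 MB of memory.
        decimal cost = EstimateMonthlyCost(3000000, 1m, 0.5m);
        Console.WriteLine("Estimated monthly cost in dollars: " + cost);
    }
}
```

With these assumed rates, three million one-second executions at 512 MB come out at about 18 dollars a month, and anything that stays under the free grant costs nothing.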

It’s dynamic

Price is cool, but we ain’t cheap! Probably the coolest feature about Functions is that Azure scales servers up and down depending on how busy your function is. So let’s say you’re doing some heavy computing which takes a while to run and is pretty resource intensive, but you need to do that 100 times in parallel… Azure just spins up some extra servers dynamically (and, of course, you pay x times more, but that’s still pretty cheap). When your calculations are complete your servers are released and you stop paying. There is a limit to this behavior. Azure will spin up a maximum of 200 server instances per Function App and a new instance will only be allocated at most once every 10 seconds. One instance can still handle multiple executions though, so 200 servers usually does not mean 200 concurrent executions.
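To get a feeling for what those limits mean in practice, here’s a small sketch that calculates the minimum time Azure would need to scale out to a given number of instances. It assumes you start with a single instance and that a new one really is added every 10 seconds, which is the best case. The class and method names are made up for this example.

```csharp
using System;

public static class ScaleOutSketch
{
    // Limits quoted in the text; treat them as assumptions that may change.
    private const int MaxInstances = 200;
    private const int SecondsBetweenNewInstances = 10;

    // Lower bound on the time needed to grow from one instance to the target.
    public static TimeSpan MinimumTimeToReach(int targetInstances)
    {
        // The target can never be below one instance or above the maximum.
        int capped = Math.Min(Math.Max(targetInstances, 1), MaxInstances);
        return TimeSpan.FromSeconds((capped - 1) * SecondsBetweenNewInstances);
    }

    public static void Main()
    {
        Console.WriteLine("100 instances: at least " + MinimumTimeToReach(100));
        Console.WriteLine("200 instances: at least " + MinimumTimeToReach(200));
    }
}
```

So even in the best case it takes a little over half an hour to reach the maximum of 200 instances, which is worth keeping in mind for sudden bursts of work.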

That said, you can’t really configure a dynamically created server. You’ll have to trust Microsoft that they’re going to temporarily give you a server that meets your needs. And, as you can guess, those needs better be limited. Basically, your need should be that your language environment, such as the .NET Framework (or .NET Core) for C#, is present. One cool thing though: next to C#, Functions can be written in F# and JavaScript, and Java is coming in v2. There are some other languages that are currently running in v2, like Python, PHP, and TypeScript, but it looks like Microsoft isn’t planning on fully supporting these languages in the near future.

Azure Functions in the Azure Portal

Let’s create our first Azure Function. Go to App Services and create a new one. When you create an App Service you should get an option to create a Function App (next to Web App, API App, etc.). The creation screen looks pretty much the same as for a regular web app, except that Function Apps have a Consumption Plan as a Hosting Plan (which you should take) and you get the option to create a Storage Account, which is needed for Azure to store your Functions.

Create a new Function App

This will create a new Function App, a new Hosting Plan, and a new Storage Account. Once Azure has created your resources, look up your Function App in your Azure resources. You should see your version of “myfunctionsblog” (which should be a unique name). Now, hover over “Functions” and click the big plus to add a Function.

Azure Functions

In the next form, you can pick a scenario. Our options are “Webhook + API”, “Timer” or “Data processing”. You can also pick a language, “CSharp”, “JavaScript”, “FSharp” or “Java”. Leave the defaults, which are “Webhook + API” and “CSharp” and click the “Create this function” button.

Editing and running your Azure Functions

You now get your function, which is a default template. You can simply run it directly in the Azure portal and you can see a log window, errors and warnings, a console, as well as the request and response bodies. It’s like a very lightweight IDE in Azure right there. I’m not going over the actual code because it’s pretty basic stuff. You can play around with it if you like.

Next to the “Save” and “Run” buttons you see a “</> Get function URL” link. Click it to get the URL to your current function. There are one or more function keys, which give access to just this function; there are one or more host keys, which give access to all functions in the current host; and there’s a master key, which you should never share with third parties. You can manage your keys from the “Manage” item in the menu on the left (with your Functions, Proxies, and Slots).
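The URL you get from the portal carries the key in a “code” query string parameter. If you’d rather keep the key out of the URL, you can also send it in the “x-functions-key” HTTP header. A minimal sketch of that, where the class name, method name, and the URL and key in Main are placeholders I made up:

```csharp
using System;
using System.Linq;
using System.Net.Http;

public static class FunctionKeySketch
{
    // Builds a GET request that passes the key in a header instead of the URL.
    public static HttpRequestMessage CreateRequest(string functionUrl, string functionKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, functionUrl);
        request.Headers.Add("x-functions-key", functionKey);
        return request;
    }

    public static void Main()
    {
        // Placeholder values; use your own function URL and key.
        HttpRequestMessage request = CreateRequest("http://localhost:7071/api/Function1", "my-function-key");
        Console.WriteLine(request.Method + " " + request.RequestUri);
        Console.WriteLine("Key header present: " + request.Headers.Contains("x-functions-key"));
    }
}
```

You can send the request with HttpClient.SendAsync. Either way, treat the key as a secret and keep it out of source control.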

Consuming Azure Functions

The Function we created is HTTP triggered, meaning we must do an HTTP call ourselves. You can, of course, use a client such as Postman or SoapUI, but let’s look at some C# code to consume our Function. Create a (.NET Core) Console App in Visual Studio (or use my example from GitHub) and paste the following code in Program.cs.
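A minimal sketch of such a client could look like the code below. It is not necessarily identical to the sample on GitHub. It assumes the default HTTP template, which reads a “name” value from the query string, and the FunctionUrl constant is a placeholder that you have to replace with the URL from “Get function URL”. The helper names are made up for this example.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Program
{
    // Placeholder: paste the value from "Get function URL" here.
    private const string FunctionUrl =
        "https://<your-function-app>.azurewebsites.net/api/<your-function>?code=<your-function-key>";

    // Appends the "name" parameter that the default HTTP template is assumed to read.
    public static string WithName(string functionUrl, string name)
    {
        string separator = functionUrl.Contains("?") ? "&" : "?";
        return functionUrl + separator + "name=" + Uri.EscapeDataString(name);
    }

    // Calls the Function and returns the status code together with the response body.
    public static async Task<string> CallFunctionAsync(HttpClient client, string url)
    {
        HttpResponseMessage response = await client.GetAsync(url);
        string body = await response.Content.ReadAsStringAsync();
        return (int)response.StatusCode + ": " + body;
    }

    public static void Main()
    {
        using (var client = new HttpClient())
        {
            string url = WithName(FunctionUrl, "Function user");
            // Blocking on the task keeps this working with the default C# language version.
            Console.WriteLine(CallFunctionAsync(client, url).GetAwaiter().GetResult());
            Console.ReadKey();
        }
    }
}
```

The status code is printed along with the body, so you can tell right away whether the call succeeded or whether the key or URL is wrong.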

If your Function returns an error (for whatever reason) your best logging tool is Application Insights, but that is not in the scope of this blog post.

Of course, if you have a function that’s triggered by a timer or a queue or something else you don’t need to trigger it manually.

Azure Functions in Visual Studio 2017

Creating a Function in the Azure Portal is cool, but we want an IDE, source control, and CI/CD. So open up VS2017 and look for the “Azure Functions” template (under Visual C# -> Cloud). You can choose between “Azure Functions v1 (.NET Framework)” and “Azure Functions v2 Preview (.NET Standard)”. We like .NET Standard (which is the common denominator between the .NET Framework, .NET Core, UWP, Mono, and Xamarin) so we pick the v2 Preview.

Again, go with the HTTP trigger and find your Storage Account. Leave the access rights on “Function”. The generated function template is a little bit different than the one generated in the Azure portal, but the result is the same. One thing to note here is that Functions actually use the Microsoft.Azure.WebJobs namespace for triggers, which once again shows the two can (sometimes) be interchanged.

When you run the template you get a console which shows the Azure Functions logo in some nice ASCII art as well as some startup logging. Windows Defender might ask you to allow access for Functions to run, which you should grant.

Now if you run the Console App we created in the previous section and change the functions URL to “http://localhost:7071/api/Function1” (port may vary) you should get the same result as before.

Deployment using Visual Studio

If you’ve read my previous blog posts you should be pretty familiar with this step. Right-click on your Functions project and select “Publish…” from the menu. Select “Select Existing” and enable “Run from ZIP (recommended)”. Deploying from ZIP will put your Function in a read-only state, but it most closely matches your release using CI/CD. Besides, all changes should be made in VS2017 so they’re in source control. In the next form find your Azure App and deploy to Azure.

If you’ve selected Azure Functions v2 earlier you’ll probably get a dialog telling you to update your Functions SDK on Azure. A little warning here, this could mess up already existing functions (in fact, it simply deleted my previous function). As far as I understand this only affects your current Function App, but still use at your own risk.

Once again, find your Functions URL, paste it in the Console App, and check if you get the desired output.

Deployment using VSTS

Now for the good parts, Continuous Integration and Deployment. Open up Visual Studio Team Services and create a new build pipeline. You can pick the default .NET Core template for your pipeline, but you should change the “Publish” task so it publishes “**/FunctionApp.csproj” and not “Publish Web Projects”. You can optionally enable continuous integration in the “Triggers” tab. Once you’re done you can save and queue a build.

The next step is to create a new release pipeline. Select the Azure “App Service deployment” template. Change the name of your pipeline to something obvious, like “Function App CD”. You must now first add your artifact and you may want to enable the continuous deployment trigger. You can also rename your environment to “Dev” or something.

Next, you need to fill out some parameters in your environment tasks. The first thing you need to set is your Azure subscription and authorize it. For “App type” pick the “Function App”. If you’ve successfully authorized your subscription you can select your App Service name from the drop-down. Also, and this one is a little tricky, in your “Azure App Service Deploy” task, disable “Take App Offline” which is hidden under the “Additional Deployment Options”. Once you’re done, and the build is finished, create a new release.

When all goes well, test your changes with the Console App.

If everything works as expected try changing the output of your Function App, push it to source control, and see your build and release pipelines do their jobs. Once again, test the change with your Console App.

Working around some bugs

So… A header that mentions bugs and workarounds is never a good thing. For some reason, my release kept failing due to an “invalid access to memory location”. Probably because I had already deployed the app using VS2017. I also couldn’t delete my function because the trash can icon was disabled. Google revealed I wasn’t the only one with that problem. So anyway, I am currently using a preview of Azure Functions v2 and I’m sure Microsoft will figure this stuff out before it goes out of preview.

Here’s the deal, you probably have to delete your Function App completely (you can still delete it through your App Services). Recreate it, go to your Function App (not the function itself, but the app hosting it, also see the next section on “Additional settings”), and find the “Function app settings”. Over here you can find a switch between “~1” and “beta” (which are v1 and v2 respectively). Set it to “beta” here. Now deploy using VSTS. Publishing from VS2017 will cause your release to fail again.

Bottom line: don’t use VS2017 to deploy your Function App!

Additional settings

There’s just one more thing I’d like to point out to you. While Azure Functions look different from regular Web Apps and Web APIs they’re still App Services with a Hosting Plan. If you click on your Function App you land on an “Overview” page. From here you can go to your Function app settings (which includes the keys) and your Application settings (which look a lot like Web App settings). You’ll find your Application settings, like “APPINSIGHTS_INSTRUMENTATIONKEY”, “AzureWebJobsDashboard”, “AzureWebJobsStorage” and “FUNCTIONS_EXTENSION_VERSION”.
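Application settings are exposed to your code as environment variables, so reading one from a Function takes no more than a single call. Here’s a small sketch, where the class name, the method name, and the fallback value are made up for this example:

```csharp
using System;

public static class AppSettingsSketch
{
    // Reads an Application setting, falling back to a default when it is missing.
    public static string GetSetting(string name, string fallback)
    {
        string value = Environment.GetEnvironmentVariable(name);
        return string.IsNullOrEmpty(value) ? fallback : value;
    }

    public static void Main()
    {
        // In Azure this prints the configured runtime version, such as "~1" or "beta".
        Console.WriteLine("Runtime version: " + GetSetting("FUNCTIONS_EXTENSION_VERSION", "(not set)"));
    }
}
```

When you run this on your own machine the setting usually isn’t there, which is why the sketch takes a fallback value.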

Another tab is the “Platform features” tab, which has properties, settings and code deployment options (see my post Azure Deployment using Visual Studio Team Services (VSTS) and .NET Core for more information on deployment options).

Functions Platform features

Wrap up

Azure Functions are pretty cool and I can’t wait for v2 to get out of preview and fully support .NET Standard as well as fix the bugs I mentioned. Now, while Functions may solve some issues, like dynamic scaling, they may introduce some problems as well.

It is possible to create a complete web app using only Azure Functions. Whether you’d want that is another question. Maybe you’ve heard of micro-services. Well, with Functions, think nano-services. A nano-service is often seen as an anti-pattern where the overhead of maintaining a piece of code outweighs the code’s utility. Still, when used wisely, and what’s wise is up to you, Functions can be a powerful, serverless, asset to your toolbox. If you want to know more about the concepts of serverless computing I recommend a blog post by my good friend Sander Knape, who wrote about the AWS equivalent, AWS Lambda, The hidden challenges of Serverless: from VM to function.

Don’t forget to delete your resources if you’re not using them anymore or your credit card will be charged.

Happy coding!

WebJobs in Azure with .NET Core 2.1

The post “WebJobs in Azure with .NET Core 2.1” appeared first on MSDN Azure Development Community.

WebJobs aren’t new to Azure or .NET. There’s even a default Azure WebJob template in Visual Studio 2017 for the full .NET Framework. However, a similar template for WebJobs in .NET Core is somehow missing from Visual Studio. In this post, I’m using .NET Core 2.1.

Creating a WebJob in .NET Core isn’t hard, but you have to know some tricks, especially if you want to use some .NET Core goodies like logging and DI.

In this post, we’re going to build a WebJob and release it to Azure using Visual Studio, the Azure portal, and VSTS.

You can find the code samples for this post on GitHub.

What are WebJobs

A WebJob is a program running in the background of an App Service. It runs in the same context as your web app at no additional cost. Maybe you need to do some hourly task or do some cleanup task every night at 1 AM. Azure Application Insights uses a WebJob to report your app’s statistics.

WebJobs can be scheduled, like hourly or daily, but they can also be triggered. A trigger could be a file upload or a new message on a queue.

WebJobs vs. Functions

I’ve often found comparisons between WebJobs and Azure Functions. In a way, Functions are the successors to WebJobs. Functions are (usually) small pieces of code that run in Azure and are, just like WebJobs, triggered at a certain event, including an HTTP trigger.

Functions are often a great alternative to WebJobs, but if you already have a web app it could make sense to use a WebJob instead. Especially if you want to share code and/or settings between the WebJob and your web app as they run in the same context, which also makes deployment quite easy.

Creating a Storage Account

Before we continue let’s take care of something first. A WebJob requires an Azure Storage Account. I’ll quickly walk you through the process of creating one.

In Azure, find “Storage Accounts” and add one. You’ll have to pick a name that’s unique across Azure. Other than that you can leave the defaults. We’re talking about cents per GB, so don’t worry about costs too much.

Once your Storage Account is ready, select it and find your “Access keys”. We’ll need one of the two connection strings later.

Creating a WebJob

As said, there’s a WebJob template for the full .NET Framework. I recommend you check it out. Start by creating an ASP.NET Web Application and then add a new WebJob. If you try to create the WebJob right away you’ll get an error saying that the project needs to be saved first (although it does create the WebJob).

We’re here for the .NET Core version of a WebJob though. So start by creating an ASP.NET Core Web Application and then add a new .NET Core Console App project to your solution.

The first thing we need to do is install the Microsoft.Azure.WebJobs package from NuGet. We should also install Microsoft.Azure.WebJobs.Extensions. Here’s the catch though, the latest stable versions of these libraries have dependencies on the full .NET Framework so we’re going to need version 3.0.0-beta5 (at the time of this writing), which is fully compatible with .NET Core.

Other NuGet packages we’ll need are Microsoft.Extensions.Options.ConfigurationExtensions (which also gives us the Microsoft.Extensions.Options package, which we also need), Microsoft.Extensions.DependencyInjection and Microsoft.Extensions.Logging.Console. Be sure to install version 2.1.0 of these packages because there seems to be a bug in .NET Core 2.1 that prevents you from using packages with patch versions, like 2.1.1.

Join the Program

The next thing we need to do is change our Program.cs file. If you’ve created a WebJob using the .NET Framework template you can simply copy and paste the Program.cs file that was generated there (although you might want to change the namespace).

Adding Configuration and DI

So I promised you’d get all the .NET Core goodies like logging and DI. By default, a Console App doesn’t have any of that, but you can add it yourself.

Next, create an appsettings.json file and set the “Copy to Output Directory” property to “Copy always”. The appsettings.json file should have two connection strings as mentioned in the Program.cs template file. These are the Storage Account connection strings we created earlier.

The next thing we need is a custom IJobActivator that can be used to inject dependencies into our classes. It needs to be set on the JobHostConfiguration in the Program class.

Adding a Trigger

After that, create a class and name it Functions (just like in the WebJob template). The Functions class will have the actual code for our WebJob.

Of course, we’ll need to add a trigger. This is different than the full .NET Framework. After all, the template uses a static method, which makes DI impossible. Speaking of DI, notice that we’ve also added the Functions class itself to the DI container.

For simplicity, we’ll use a TimerTrigger, which is triggered with a so-called CRON expression. This simply means it’s triggered at a certain minute, hour, day, etc. In this example, it triggers every minute.

We’ll also need to configure timers on the JobHostConfiguration.
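Putting the steps above together, the activator, the Functions class, and the Program class could look roughly like the sketch below. It is not necessarily the same as the code on GitHub. It assumes the 3.0.0-beta5 WebJobs packages and the 2.1.0 Microsoft.Extensions packages mentioned earlier, the method name ProcessTimer and the log message are made up, and loading appsettings.json is left out to keep it short.

```csharp
using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

// Lets the WebJobs host create job classes through the DI container.
public class JobActivator : IJobActivator
{
    private readonly IServiceProvider services;

    public JobActivator(IServiceProvider services)
    {
        this.services = services;
    }

    public T CreateInstance<T>()
    {
        return services.GetService<T>();
    }
}

public class Functions
{
    private readonly ILogger<Functions> logger;

    // The logger is injected because Functions is registered in the container.
    public Functions(ILogger<Functions> logger)
    {
        this.logger = logger;
    }

    // Six-field CRON expression: {second} {minute} {hour} {day} {month} {day of week}.
    // "0 * * * * *" fires at second 0 of every minute.
    public void ProcessTimer([TimerTrigger("0 * * * * *")] TimerInfo timerInfo)
    {
        logger.LogInformation("Timer fired at {Time}", DateTime.Now);
    }
}

public static class Program
{
    public static void Main()
    {
        var services = new ServiceCollection();
        services.AddLogging(builder => builder.AddConsole());
        services.AddTransient<Functions>();

        // The host still expects the Storage Account connection strings described earlier.
        var config = new JobHostConfiguration();
        config.JobActivator = new JobActivator(services.BuildServiceProvider());
        config.UseTimers();

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
```

Note that ProcessTimer is an instance method, not a static one. That’s what allows the host to ask the activator for a Functions instance with its dependencies filled in.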

Running the example

If you did everything correctly, or if you’re running my code from GitHub, you should now be able to run the Console App. If you break on exceptions or if you’re watching the Output window you may notice a lot of StorageExceptions. Don’t worry about them and ignore them. This is a bug in the WebJobs library and will not affect your program. It may take a minute for your trigger to go off, so have a little patience.

If you head over to your Azure Storage Account you should see two Blob Containers, “azure-jobs-host-output” and “azure-webjobs-hosts”. There’s quite a lot going on here, but you can just ignore it. Sometimes my WebJob triggers wouldn’t go off for some reason, and deleting the Blob Containers usually helped. Apparently, there’s some state stored in there which isn’t always disposed of properly when (re-)adding and removing WebJobs.

Deploying to Azure

The next thing we want to do is deploy our WebJob to Azure. In order for a WebJob to run it needs some executable script that it can use to get going. Many file types are supported, but for us Windows users it makes sense to use an exe, cmd, bat or PowerShell file.

A Console App used to be an exe file, but in .NET Core, it produces a regular DLL file that we need to start manually. So, create a file and name it “run.cmd” and make sure that it’s encoded in UTF-8 without a BOM (you can check this using something like Notepad++). It just needs a single line of code, which is “dotnet NetCoreWebJob.WebJob.dll”. This runs your Console App. Make sure you set the “Copy to Output Directory” of the file to “Copy always”.

One last thing, for some reason Azure WebJobs needs all the dependencies of a WebJob, which means all .NET Core packages we used to build it. You can do this by editing the csproj file and adding “<CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>” to the first <PropertyGroup> (underneath “<TargetFramework>”).

Before we can deploy our WebJob we need to deploy our web app. Right-click the ASP.NET project and click “Publish…”. Simply follow the wizard and Visual Studio will deploy your app for you. You can create a new web app or select an existing one. This step isn’t strictly necessary as you can host stand-alone WebJobs, but this should be familiar and it gives you an App Service we can use for our WebJob.

Deploy using Visual Studio

Deploying WebJobs using Visual Studio should be easy as pie. In fact, you probably already know how to do this (don’t do it yet though). Right-click your WebJob project and click “Publish…”. The following wizard looks a whole lot like the publication of a web app, which we just did. You can pick “Select existing” and pick the Azure web app we just created.

Unfortunately, Microsoft messed up this feature in the worst way possible. Visual Studio will deploy the WebJob with the same name as the project, which is “NetCoreWebJob.WebJob”, except the dot is an illegal character in a WebJob name! This messed up my project so badly that I had to manually edit it to get my solution working again. Nice one, Microsoft!

So here’s what you do. At the start of the wizard, where you pick either a new or existing App Service, click the arrow next to “Publish immediately” and pick “Create Profile” instead. Now you can first change the name of your WebJob in the settings and deploy after that. Make sure you don’t select “Remove additional files at destination” or you’ll remove your web app.

Now, browse to the Azure Portal and look up your web app. You’ll find “WebJobs” in the menu. You’ll see your WebJob, but it’s not doing anything. You need to manually run it by selecting it and clicking “Run”. The status should update to “Running”. You can now check out the logs to see that it actually works. You may see an error about the connection strings, but you can ignore it. If you toggle the output you’ll still see a log is written to the console which lets you know it works! If you don’t see a log right away try waiting a minute or two and don’t forget to manually refresh the output.

WebJobs in Azure

Deploy using the Azure Portal

When you add a new WebJob you’ll need to fill out some options. You can make up some name, set the type to “Triggered” and the triggers to “Manual”. Your alternatives are a “Continuous” WebJob, which just runs and closes (unless you’ve implemented an endless loop in your application); and a “Scheduled” triggered job, which is basically what we have except we’ve implemented the schedule ourselves.

The “File upload” needs a bit of explanation. Here you can upload a zip file that contains your WebJob. So head over to Visual Studio and build your solution. Then go to the output folder of your WebJob project, something like “MyProject\bin\[Debug|Release]\netcoreapp2.1”, and put everything in that folder into a zip file. Then select it in the “File upload” in your new WebJob.

Add WebJob

It will take a few seconds for Azure to create your WebJob so keep refreshing until it pops up. After that, you have to start it manually again and you can check out the logs.

Deploy using VSTS

Ultimately, we want to add our WebJob to our CI/CD pipeline in VSTS. Unfortunately, this functionality doesn’t come out of the box. Luckily, it’s not very difficult either. If you’re not familiar with builds or releases in VSTS check out one of my previous blogs Azure Deployment using Visual Studio Team Services (VSTS) and .NET Core or ARM Templates to Deploy your Resources to Azure.

When you’re in the Azure Portal find the “App Service Editor (Preview)” of your App Service. This lets you browse all the files in your App Service. One thing we notice is that your WebJob is located in “App_Data\jobs\triggered\[WebJob name]”. And since your WebJob is really just the output of the WebJob project build it’s simply a matter of copying your WebJob files to App_Data.

WebJob file location

The build

So go to VSTS and create a new build. Select your repository, your branch, and select “ASP.NET Core” as a template. We only need to change two things here. We need to change the existing “Publish” task and add a new “.NET Core” task to publish our WebJob.

Change the name of the existing publish task to “Publish web app”, untick the “Publish Web Projects” checkbox, and enter the “Path to project(s)”, which is “**/NetCoreWebJob.csproj”. Also, untick the “Zip Published Projects” and “Add project name to publish path” checkboxes as they will eventually mess up our release.

After that create a new .NET Core task, set the “Command” to “publish” and change the name of the task to “Publish web job”. Again, untick “Publish Web Projects” and set the “Path to project(s)”, which is “**/NetCoreWebJob.WebJob.csproj”. And once again, don’t zip the published projects or add the project name to the publish path. The last step here is the “Arguments” field, which can be copy/pasted from the other publish step, except we’re going to add a bit to it: “--configuration $(BuildConfiguration) --output $(build.artifactstagingdirectory)\App_Data\jobs\triggered\WebJobVSTS”.

VSTS WebJob build

The release

Last, but not least, is the release. Create a new release in VSTS, pick the “Azure App Service deployment” template and fill in the blanks, which is an artifact and your Azure settings in the Environment. Because we’re not zipping our build we just need to change one setting. In the “Deploy Azure App Service” task is a “Package or folder” setting, which has a default of “[…]/*.zip” which obviously isn’t going to work. Instead, use the browser (the button with “…”) and select your drop folder.

Save it, hit new release and pick your latest build. If all goes well you should see your new WebJob in the Azure Portal!

Wrap up

Hopefully, Microsoft will come with a built-in solution for creating, consuming, and deploying WebJobs to Azure in .NET Core soon. Until that time it’s not particularly difficult, it’s just a matter of knowing what to do.

In this post, we’ve seen the TimerTrigger, but there’s a QueueTrigger, a BlobTrigger, and a FileTrigger as well. Experiment, google, and read the official documentation.

Don’t forget to delete your resources.

Happy coding!

Azure Deployment using Visual Studio Team Services (VSTS) and .NET Core

Hey all,
After three years of radio silence I decided to write a new blog post *crowd goes wild*! A lot has happened since my last post. I wrote a book on Continuous Integration, Delivery, and Deployment (with JavaScript and Jenkins, mostly); got a Microsoft Azure certificate; found a new job using those Azure skills; and I got my own company on the side. Three years ago I ended with a blog series on math, but I’m not finishing that. Apologies if you were really looking forward to #4… My new focus, for now, will mainly be Microsoft Azure, Visual Studio Team Services (VSTS), and C# .NET Core. So without further delay, let’s talk about those!

Microsoft Azure

Microsoft Azure has been around for a few years now and you’ve probably heard of it. It’s the Microsoft cloud platform that competes directly with Amazon’s AWS and Google’s Google Cloud. If you don’t really know what “the cloud” is (besides the pretty hyped buzzword) I suggest you really look into it as I believe it’s the future (heck, it’s the present)! If you do not already have an account I suggest you create one. It’s free and you only need a credit card and a Microsoft email account like hotmail or outlook.

Beware, Azure is free for now. You get free credit which is valid for one month. Some services cost a monthly fee, for others you pay per use. Just having an empty Azure account is always free. In this post we’re going to host a default web application in Azure using an App Service (which, in any case, will cost you close to nothing).

Visual Studio Team Services (VSTS)

VSTS is Microsoft’s cloud version of Team Foundation Server (TFS), which is their tool for continuous integration and deployment, source control, and agile tools like Scrum and Kanban boards. Again, if you do not yet have a subscription I recommend you create one. Just like with Azure it’s free and you need a Microsoft email address, like Hotmail or Outlook. For the free version of VSTS you get quite a bit of functionality, like 240 free build minutes per month and unlimited free private Git repositories (always use Git, never Team Foundation Version Control (TFVC)). With an MSDN subscription you get some extra functionality, but the free tier is good enough for this post.

.NET Core

Do I really need to explain this? For the last few years Microsoft has focused on .NET Core, an open source multi-platform subset of the .NET Framework. While it still has some issues, even in the latest .NET Core 2.1 release, it’s also pretty cool and I’m already using it for production software. It’s a bit different than the full .NET Framework, but not a lot. You may have to download and install the latest SDK, which can all be found on the .NET Core GitHub page. Visual Studio support for .NET Core starts from Visual Studio 2017, so be sure you’re up to date, otherwise you can probably follow along with a regular .NET Framework ASP.NET Web Application.

Creating a .NET Core project

So let’s put these tools together. We start by creating a new (private) project in VSTS, which is pretty simple. In fact, you probably already created one when you created a VSTS account. I kept the default, MyFirstProject. When you browse to the Code tab of your project in VSTS you can initialize your repository (at the bottom of the page) with a README.md file and a .gitignore file (choose Visual Studio).

Initialize a repository in VSTS

You can copy the link or clone directly into Visual Studio on the next page in the right upper corner under the “Clone” button. I’m assuming you know how Git works, so I’m not covering that here.

Once you have the repository locally we can start by adding some code. Since we’re focusing on deploying to Azure using VSTS it doesn’t really matter what code you have, so just create an ASP.NET Core Web Application in Visual Studio 2017. You can pick a regular Web Application (which will use Razor View Pages by default) or a Model-View-Controller project, whichever you fancy. Once the default web application template loads you can commit and push it to your Git repository and we can start the fun!

Azure App Services

For the next step we’re going to mess around in Azure a bit. Log in to your Azure portal and find your App Services. Simply click the “Add” button, pick a “Web App” and hit “Create”. On the next page you can set some settings for your web app, such as the URL at which your app will be hosted (which always ends at .azurewebsites.net).

New Azure Web App

You’ll also have to create a resource group or pick an existing one. A resource group is simply a way to group certain resources for a quick overview. For example, you can have a separate resource group for dev, test, acceptance and production or for your customer portal and your internal tools, it’s all up to you.

The most important setting here is the App Service Plan/Location. This is basically a server that you pay for. A new plan is generated for you, but you can create your own as well. You have different price tiers with different hardware specifications and Azure functionality, like CPU, memory, custom domains, staging slots, and daily backups.

The location of your plan determines where the server is physically located. For me, in the Netherlands, West Europe is ideal because I know it’s in Amsterdam. Just go with the default Standard S1 plan. Even if you don’t have free credits you can delete it after you’ve read this blog post and it will only cost you like $0.03, so don’t worry about costs.

Setting up deployment from Azure

For now we’re staying in Azure. You’ve probably seen the “Build and Release” button in VSTS. That’s where you can create new build and deployment pipelines, which is great, but also a bit of work.

A build pipeline can automatically build and test your software after a push to Git. This includes restoring NuGet packages, minifying and bundling any HTML, CSS and JavaScript you might have, executing task runners and basically anything you’d ever need to make your software build and run. When your latest commit is built and tested VSTS can create an artifact for publication.

The deployment pipeline can download the artifact from a build pipeline and put it in Azure (or on your own on-premises servers). The trigger for a deployment is usually that a build has succeeded, but it can also be a push to Git or a manual trigger.

When you’re building and testing your software automatically this is called Continuous Integration (or CI) and when your software is also automatically deployed this is called Continuous Deployment (or CD). There’s something in between, called Continuous Delivery, which holds off the deployment until a user explicitly presses a button and gives the “okay” sign. When you’re doing both CI and Continuous Delivery and/or Deployment you’re doing “CI/CD”.

Let’s set this up with pretty much a single button click. You’ve got two options in Azure. In your Web App, either configure deployment from the “Deployment options” menu item or use the newer “Deployment Center (Preview)” which, as the name suggests, is still in preview.

Deployment options in Azure

Deployment options

It’s really so simple I’m not even sure if I should walk you through the next steps. Click the “Deployment options” menu item and simply fill in the options you want to use.

Deployment settings in Azure

And that’s that. If you now browse to the Deployment options in your Web App you’ll see that Azure now shows all your commits. If you push a new commit to your repository you can even see the build status in Azure because Azure linked it to your VSTS account.

Continuous deployment in Azure

When deployment is done, browse to your Azure website (for me that’s imunique.azurewebsites.net) and behold, your website is up and running! Try changing something on the homepage, push it to Git, wait a few minutes, and see your website get updated.

Deployment Center (Preview)

My guess is that Microsoft is planning to replace the “Deployment options” with the “Deployment Center (Preview)”. If you have already set up CI/CD with the Deployment options you’ll see your commits in here too. For the purpose of this blog we’re going to “Disconnect” (button in the top menu).

Now go to the Deployment Center (Preview) and you should see various source control options, like VSTS, GitHub and Bitbucket. It’s a little more visually appealing than most Azure panels. So pick VSTS and continue. Next, you have to pick a build provider. I have zero experience with Kudu, so pick “VSTS Continuous Delivery” (which somehow isn’t the default).

Deployment center in Azure

In the next panel you can pick your VSTS account and repository. Unfortunately, at the time of writing this blog there seems to be a bug here which prevents me from picking an account. It worked in the past and the Azure team said they have fixed the problem in an upcoming release (which is pretty often)… So try it out yourself as this may work for you!

Anyway, if you continue, Azure creates a build pipeline and a deployment pipeline in VSTS! Now that’s pretty cool. It gives you complete control over your builds and deployments as you can tweak them however you like. At the same time the Deployment Center now shows you which builds have been deployed.

Pipelines in VSTS

So browse to your VSTS and check out the build. The user interface isn’t always intuitive, but you should be able to find it. You can edit the build and click around a bit. You shouldn’t need to change it, but you can.

VSTS build

The deployment is a little different. There’s just one build for every piece of code, but there can be multiple deployments, for example for your entire DTAP street (dev, test, acceptance, and production). Azure created a release with a single environment, “Dev”.

VSTS release

You can change the deployment in the “Tasks” for that environment (either click the Dev tile or select Tasks in the upper menu). The tasks look kind of similar to the VSTS build and can be (completely) different for each environment. Mostly, your tasks will be more or less the same and you’ll only have some values that differ per environment. You can use “Variables” for these different values, but that’s out of the scope of this post.

It’s also pretty easy to clone an entire environment (including its tasks and variables). Just hover over the tile of the environment you’d like to clone and a “Clone” button will show up. Click it and you get a new environment that’s exactly like the one you cloned. Further setup of builds and releases is out of scope of this post, but you can click around if you like.

When you’re done playing around simply delete the release and the build pipelines (in that order). Also make sure you disconnect the deployment in the Azure Deployment Center.

Deployment from Visual Studio 2017

Finally, let’s switch back to Visual Studio. You can do the exact same thing in Visual Studio. Right click on the project you want to release (which is your web project) and choose “Overview” in the drop down menu. In the page that opens go to “Publish” (in the menu on the left).

Here you can publish or set up your CI/CD. Click the “Start” button under “Continuous Delivery” and you can once again choose a VSTS subscription and an App Service (it creates a new App Service by default, but you can also select an existing one). Simply clicking the “Ok” button will have the same outcome as what we did in the Deployment Center. It will create a new build and deployment pipeline that is also visible in your Deployment Center. Pretty cool!

Going back to the Publish page in Visual Studio, you can also just publish directly. This is pretty useful if you want to release something to Azure right now. Simply click the “Start” button under “Publish”. You can now either create a new Azure App Service or pick an existing one. Just follow the wizard and your application will be published to Azure directly, without a build or deployment pipeline. Be careful though, your local build will be deployed, which will often be different than a build from the build server.

Wrap up

So there you have it! Deployment was never this easy! In just a couple of minutes we’ve created an application, put it in source control, and deployed it to the cloud, ready for the world to see (you can restrict access, but that’s out of scope). In the past it could take you hours to set up a new server (or days and even weeks if you still had to order one), but this alternative is a lot faster and cheaper.

The next step is using ARM (Azure Resource Manager) templates to automatically create App Services and other Azure resources. Check out my post on ARM templates to get started.

Don’t forget to delete your App Service and your App Service plan or you’ll be charged for them!

Good luck and happy coding!

Maths in IT #2: Venn diagrams

Hey everyone, welcome back to part two of the Maths in IT series. I got a lot of positive response, so I guess I should just keep doing what I was already doing. This post will continue where part one left off, so if you haven’t read it I suggest you do so now before continuing.

  1. Maths in IT #1: Basic set theory
  2. Maths in IT #2: Venn diagrams
  3. Maths in IT #3: Algebra of sets
  4. Coming soon…

Here’s a quick cheat sheet with symbols I’ll use in this article:
Explicit definition: A = \{a, b, c\}
Implicit definition: A = \{x | \text{ x is a letter in the alphabet}\}
a is an element of A: a \in A
a is not an element of A: a \notin A
A is a subset of B: A \subset B
Empty set: \emptyset

Venn diagrams

As promised we’ll use this post to combine sets. Before we do that let’s take a look at how to visually represent a collection. We can do this using a Venn diagram. A Venn diagram is about as simple as it gets (although they can be pretty complex too). Each set is represented by a circle. The diagram can illustrate relationships between the represented sets.

Let’s look at an example. We have two collections, A = \{a, b, c\} and B = \{d, e, f\}. Let’s show them in a Venn diagram.

A Venn diagram

When two sets have no shared elements, that is, when for every x \in A it holds that x \notin B, we say the sets are disjoint.

Now suppose A \subset B (A is a subset of B). We can show this in a Venn diagram and you’ll recognize it immediately.

A Venn diagram of a subset

When A is a subset of B then B overlaps A completely.

In the next sections we’ll combine sets and use Venn diagrams to visualize what elements we’re interested in.

Intersection

So suppose we have two collections, A = \{a, b, c, d\} and B = \{c, d, e, f\}. If I asked you which elements are in both A and B you’d answer c and d.

This is called the intersection of A and B.
We can write this as A \cap B.
In this example we can say A \cap B = \{c, d\}
More formally we say that A \cap B = \{x | x \in A \text{ and } x \in B\}.
And the symbol for “and” is actually \land, so to formalize it completely:

A \cap B = \{x | x \in A \land x \in B\}

Phew, that looks a whole lot like maths! You should read that as “the intersection of A and B is the collection of every x where x is an element of A and x is an element of B.”
And here is the Venn diagram, which visualizes this nicely. The intersection is the part in the thick black line where A and B overlap.

A Venn diagram of an intersection

With an intersection we can give a formal definition of disjoint sets. When A \cap B = \emptyset then A and B are disjoint.

Furthermore we can say that for any collection A it holds that A \cap \emptyset = \emptyset.
Also A \cap A = A.
And when A \subset B then A \cap B = A (check the subset Venn diagram).
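
To make these rules a bit more tangible, here is a minimal C# sketch that checks them with LINQ. The class and method names (IntersectionDemo, Intersection, AreDisjoint) are made up for this illustration; only Intersect and Any come from LINQ itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to illustrate the definitions above.
public static class IntersectionDemo
{
    // A ∩ B: every x that is an element of A and an element of B.
    public static HashSet<T> Intersection<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(a.Intersect(b));
    }

    // A and B are disjoint when their intersection is the empty set.
    public static bool AreDisjoint<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return !a.Intersect(b).Any();
    }
}
```

Calling IntersectionDemo.Intersection with {a, b, c, d} and {c, d, e, f} gives a set containing c and d, and intersecting any set with an empty collection gives an empty set.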

Union

The intersection of two sets is the set of elements x where x is in A and B. Likewise, the union of two sets is the set of elements that are in set A or B. So when we have A = \{a, b, c, d\} and B = \{c, d, e, f\} then the union of A and B is \{a, b, c, d, e, f\} (no doubles).

We can write a union of A and B as A \cup B.
Like \land is the symbol for “and” \lor is the symbol for “or”. So the formal definition of union is as follows:

A \cup B = \{x | x \in A \lor x \in B\}

Read that as “the union of A and B is the collection of every x where x is an element of A or x is an element of B.”
In a Venn diagram the union is basically just both sets (the part in the thick black line).

A Venn diagram of an overlapping union

Unlike an intersection, a union of disjoint sets is not an empty set (notice that both sets have a thick black line).

A Venn diagram of a union

Now for any collection A it holds that A \cup \emptyset = A.
And, again, A \cup A = A.
Also when A \subset B then A \cup B = B (check the subset Venn diagram).
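
The same kind of check works for unions. This is just a sketch; UnionDemo and its Union method are hypothetical names, and the real work is done by LINQ’s Union, which already leaves out doubles.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to illustrate the definition above.
public static class UnionDemo
{
    // A ∪ B: every x that is an element of A or an element of B (no doubles).
    public static HashSet<T> Union<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(a.Union(b));
    }
}
```

With {a, b, c, d} and {c, d, e, f} this gives the six elements a through f, and the union with an empty collection gives back the original set.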

Universe and Complement

When we’re talking about sets we’re usually not talking about that set in isolation. When I tell you that non-smokers live longer you understand that they live longer compared to people that do smoke. And you also understand that non-smokers and smokers combined make up the world’s population. In this case we’re saying that we have a set of non-smokers in a universe of all people. We denote a universe with the capital letter U.

For all sets A_1, A_2, A_3, ..., A_n it holds that A_i \subset U (for every i from 1 to n).
That makes sense as U represents all elements we wish to consider for our case. No collection can ever contain an element that is not a part of all elements.

In a Venn diagram we draw a universe as a rectangle in which all our sets are drawn. In the following example N is the set of non-smokers and U is the universe containing all people.

A Venn diagram of a universe

Now with the notion of a universe we can say we want all elements that are not in a given collection A. We call this the complement of A and we write it as A^c. In the following Venn diagram the white part represents A^c.

A Venn diagram of a complement

We can now formally define the complement:

A^c = \{ x | x \in U \land x \notin A \}

It goes without saying, but \emptyset^c = U and U^c = \emptyset.
A little less obvious is that (A^c)^c = A. It makes sense though, as we first take everything that isn’t A (the complement of A) and then we take everything that isn’t in the resulting set, but that is A. Try drawing it in a Venn diagram and you’ll see what I mean.

A complement relative to a universe is called an absolute complement. A complement can also be relative to other sets. For example, when no universe is defined and we have sets A and B then the relative complement of A in B is the set of elements of B that are not in A. This is written as B \setminus A (or, when a universe is defined, B \cap A^c).

B \setminus A = \{ x | x \in B \land x \notin A \}

In a Venn diagram B \setminus A looks as follows:

A Venn diagram of a relative complement
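
Both kinds of complement can be sketched in C# with LINQ’s Except. The names ComplementDemo, Complement and RelativeComplement are hypothetical, and the universe has to be passed in explicitly because C# has no notion of one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to illustrate the definitions above.
public static class ComplementDemo
{
    // Absolute complement: every x in the universe that is not in A.
    public static HashSet<T> Complement<T>(IEnumerable<T> universe, IEnumerable<T> a)
    {
        return new HashSet<T>(universe.Except(a));
    }

    // Relative complement of A in B: every x in B that is not in A.
    public static HashSet<T> RelativeComplement<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(b.Except(a));
    }
}
```

Taking the complement twice against the same universe gives back the original set, which matches the rule for (A^c)^c.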

Combining sets

We can now combine sets using intersection, union and complements. For example, let’s say our universe U is all animals on Earth. Within U we have sets M, containing all mammals, B, containing all birds, and E, containing all animals that lay eggs. Formally:
U = \{ x | \text{x is an animal} \}
M = \{ x | x \in U \land \text{x is a mammal} \}
B = \{ x | x \in U \land \text{x is a bird} \}
E = \{ x | x \in U \land \text{x lays eggs} \}
Given the following Venn diagram we can already draw some conclusions.

A Venn diagram with multiple sets

We can see that all birds lay eggs. Some mammals lay eggs too. No animal is both a bird and a mammal. As you see Venn diagrams can be really useful.
Now suppose we want the set of all mammals that lay eggs and also all birds. This is the collection (M \cap E) \cup B. That is, we take the intersection of M and E and union the result with B. In a Venn diagram we can see this collection (the red colored parts).

A Venn diagram with multiple sets

Now we can get every combination of sets using intersection, union and complement. It’s not always easy, but it’s possible.
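
As a rough sketch, the collection (M \cap E) \cup B could look like this in C#. The AnimalSets class and the handful of animals in it are sample data I made up for the illustration; they are obviously not the full sets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, a tiny stand-in for the sets M, B and E.
public static class AnimalSets
{
    public static readonly HashSet<string> Mammals =
        new HashSet<string> { "dog", "platypus", "echidna" };

    public static readonly HashSet<string> Birds =
        new HashSet<string> { "sparrow", "penguin" };

    public static readonly HashSet<string> EggLayers =
        new HashSet<string> { "platypus", "echidna", "sparrow", "penguin", "crocodile" };

    // (M ∩ E) ∪ B: all mammals that lay eggs, and also all birds.
    public static HashSet<string> EggLayingMammalsAndBirds()
    {
        return new HashSet<string>(Mammals.Intersect(EggLayers).Union(Birds));
    }
}
```

With this sample data the result contains the platypus, the echidna, the sparrow and the penguin, but not the dog (a mammal that doesn’t lay eggs) or the crocodile (an egg layer that is neither a mammal nor a bird).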

Some code

As promised I’m going to keep things practical. So what’s the practical use of all this? Well, most languages let you work with exactly these operations!

For example, take a look at this SQL expression.

(SELECT 1
UNION SELECT 2
UNION SELECT 3)
INTERSECT
(SELECT 3
UNION SELECT 4)

What will that return? It returns only 3 because 3 is an element of both collections. This example also shows the UNION operator. Basically this is the set (\{1\} \cup \{2\} \cup \{3\}) \cap (\{3\} \cup \{4\}).

And SQL knows complement too.

(SELECT 1
UNION SELECT 2
UNION SELECT 3)
EXCEPT
(SELECT 3
UNION SELECT 4)

What happens here is that we have (an implied) universe U = \{1, 2, 3, 4\} and EXCEPT is the complement. So this query is basically the formula \{3, 4\}^c.
We could also say that U is not defined and this is the relative complement of \{3, 4\} in \{1, 2, 3\}, that is, every element of \{1, 2, 3\} that is not in \{3, 4\}: \{1, 2, 3\} \setminus \{3, 4\} = \{1, 2\}.

But what about C#? In my previous post we saw HashSet<T>, but I’m not going to use that now. Instead I’m just going with a List<T> as LINQ provides various extension methods for working with sets. Notice that HashSet<T> has its own methods in addition to the LINQ methods.

using System.Collections.Generic;
using System.Linq;

List<int> a = new List<int>();
a.Add(1);
a.Add(2);
a.Add(3);

List<int> b = new List<int>();
b.Add(3);
b.Add(4);

List<int> intersect = a.Intersect(b).ToList(); // 3
List<int> union = a.Union(b).ToList();         // 1, 2, 3, 4
List<int> except = a.Except(b).ToList();       // 1, 2
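
For comparison, here is a small sketch of HashSet<T>’s own methods. IntersectWith, UnionWith and ExceptWith change the set they are called on instead of returning a new one, so this sketch copies the first set before calling them. The wrapper class HashSetOps and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper class around HashSet<T>'s in-place set operations.
public static class HashSetOps
{
    public static HashSet<int> Intersection(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a); // copy, because IntersectWith mutates the set
        result.IntersectWith(b);
        return result;
    }

    public static HashSet<int> Union(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a); // copy, because UnionWith mutates the set
        result.UnionWith(b);
        return result;
    }

    public static HashSet<int> Except(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a); // copy, because ExceptWith mutates the set
        result.ExceptWith(b);
        return result;
    }
}
```

For the sets {1, 2, 3} and {3, 4} these give the same elements as the LINQ versions.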

You may wonder what that looks like in Haskell (since I’ve shown you Haskell the last time too), but it’s not really that different.

Prelude> import Data.List
Prelude Data.List> [1, 2, 3] `intersect` [3, 4]
[3]
Prelude Data.List> [1, 2, 3] `union` [3, 4]
[1,2,3,4]
Prelude Data.List> [1, 2, 3] \\ [3, 4]
[1,2]

I’ve actually needed this stuff in my day-to-day work. It’s not that hard and sometimes it’s an explicit business requirement. For example I needed all sales orders from The Netherlands, Belgium and Luxembourg (the BeNeLux). That’s really just a union! And I’ve needed intersect too: give me all Dutch customers that are also in some list of customers (make that “not in some list” and it’s a relative complement). And how about all non-Dutch customers? That’s just the complement!

Next time I’d like to continue with algebra, and algebra of sets in particular. Does that sound like it will give you nightmares? Don’t worry, I’ll be gentle!

See you next time!

Maths in IT #1: Basic set theory

Welcome back everyone. Today I have a little something different for you. No new language or framework, but something that’s been around for millennia: maths.
When asking programmers about maths you’ll find two kinds of people, those who say you don’t need maths to be a good programmer and those who say maths is essential. Personally I think both are true. For some applications and industries you really don’t need advanced maths, but go into robotics, machine learning, statistics, or that kind of thing and you’re going to need maths, lots of it. And whether you need it or not, computers, programming languages and databases all wouldn’t exist without maths.
For now, let’s put it this way: knowing a thing or two about maths gives you an edge as a programmer!

Maths is everywhere: physics, chemistry, biology, economics and, yes, even in the arts! For this series I’ll focus on the maths we need in IT. I don’t know how many entries it’s going to have or what I’ll be discussing (and what I won’t be discussing), but I’ll be sure to keep it somewhat practical. No prior maths knowledge is assumed. Don’t worry, I’ll still post about code once in a while too!

  1. Maths in IT #1: Basic set theory
  2. Maths in IT #2: Venn diagrams
  3. Maths in IT #3: Algebra of sets
  4. Coming soon…

Collections

Do I need to tell you why we should study collections? You probably use them in your code every day in the form of arrays, database tables, lists or hash tables. What you probably didn’t know is that collections have a lot of mathematical theory! Since collections are very important in both maths and programming I’m going to start this series here. More specifically we’re going to talk about sets.

A set is an unordered collection containing only distinct values.

Let’s look at an example of a set in real life. Do you collect anything? Maybe you collect old records. Each record in your collection can be uniquely identified and doubles are for trading or selling. Also, no matter in which order you put the records on the shelf, it’s still the same collection of records. So a record collection is really a set.

Likewise we can have a collection of paintings or stamps. Another collection is an alphabet, for example the English, Russian or Greek alphabet. Using these alphabets we can construct a language. We have natural languages (like the ones I just mentioned) and formal languages. Programming languages like C#, Java, C or Haskell are examples of formal languages.

A collection in maths is usually indicated by a single suggestive capital letter, such as R for records, S for stamps or A for alphabet. If we have more than one collection we can use index notations to uniquely identify them: A_{1}, A_{2}, A_{3}

The notation for a single collection containing elements a, b and c is \{a, b, c\}. This is called the explicit definition. We can now declare a collection A as A = \{a, b, c\}.
And of course we can have a collection of collections: C = \{\{a, b\}, \{c, d\}, \{a, c\}\} . Notice that collection \{a, c\} is unique even though a and c are already elements in other collections.
The following set of sets isn’t valid: \{\{a, b\}, \{b, a\}\}. Because the order of elements in sets is ignored this set contains the same set twice, but sets also must have distinct values.
Now let’s say we have English alphabet E, Russian alphabet R and Greek alphabet G. The collection of alphabets is written as \{E, R, G\}.

Now we want to indicate that a is an element of E. We do this using the following syntax: a \in E.
To indicate that \lambda (the Greek letter lambda) is not an element of E we use notation: \lambda \notin E.

We can compare collections. Collections are considered to be equal when they contain exactly the same elements, no more and no less. \{a, b, c\} = \{a, b, c\}, \{a, b, c\} = \{c, b, a\} (remember, order is ignored), \{a, b, c\} \neq \{d, e, f\} and \{a, b, c\} \neq \{a, \{b\}, c\} (the collection \{b\} does not equal b).
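
In C# this kind of equality can be sketched with HashSet<T>.SetEquals, which ignores order (and doubles). Note that the == operator on two HashSet<T> instances compares references, not elements. SetEqualityDemo and AreEqual are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, only here to illustrate set equality.
public static class SetEqualityDemo
{
    // Two collections are equal as sets when they contain exactly the same
    // elements, no more and no less. Order is ignored.
    public static bool AreEqual<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(a).SetEquals(b);
    }
}
```

So AreEqual on {a, b, c} and {c, b, a} is true, while AreEqual on {a, b, c} and {d, e, f} is false.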

So far we’ve seen only explicitly defined collections. We can also implicitly define collections. This is especially useful when a collection contains too many elements to write down. For example a collection containing all countries on Earth. The notation for such a collection is as follows: \{x | \text{x is a country on Earth}\}. You should read that as “the collection consisting of all (objects) x for which x is a country on Earth”.
Generally we can say that an implicitly defined collection is in the form of \{x | P(x)\} where, in this example, P(x) is the statement that x is a country on Earth. Actually P(x) is a function taking parameter x and returning whether x is or isn’t a part of the set. We’ll look at functions in a later blog post.
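
In C# terms P(x) is simply a Func<T, bool>. The sketch below builds a set from a (finite) source collection and such a predicate. ImplicitSetDemo and SetOf are names I made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to illustrate { x | P(x) }.
public static class ImplicitSetDemo
{
    // Builds { x | x is in source and P(x) } for a finite source collection.
    public static HashSet<T> SetOf<T>(IEnumerable<T> source, Func<T, bool> p)
    {
        return new HashSet<T>(source.Where(p));
    }
}
```

For example, passing the 26 letters of the English alphabet and the predicate “x is a vowel” (say, c => "aeiou".IndexOf(c) >= 0) gives the set with a, e, i, o and u.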

To indicate how many elements are in any given (finite) collection we can use notation |A|. So |\{a, b, c\}| = 3 and |E| = 26 (where E is the English alphabet). Of course this is only possible when our collection is finite (we can count the elements).

If a collection is empty (it contains no elements) or some collection A = \{\} we can use a special symbol \emptyset. And of course |\emptyset| = 0 (an empty set has 0 elements).

When we allow the same element to appear in a collection more than once and we start taking the order of elements into consideration we’re speaking of a row (you’ll also see this called a sequence or a string). The syntax for a row is simply putting the elements next to each other. For example, using the set \{a, b, c\} we can make the rows a, ababc, baac, caab, but not baad (because d is not in the set).
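
A C# string is a nice example of such a row of characters. As a small sketch, this checks whether a string can be made from the elements of a given set. RowDemo and IsRowOver are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to illustrate rows over a set.
public static class RowDemo
{
    // True when every element in the row is an element of the given set.
    // Doubles and order don't matter for this check.
    public static bool IsRowOver(string row, ISet<char> set)
    {
        return row.All(c => set.Contains(c));
    }
}
```

With the set {a, b, c} this returns true for ababc and false for baad.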

Infinite collections

So far we’ve looked at finite collections.  Let’s look at some infinite collections now. Consider the collection of all numbers: 1, 2, 3, 4… In theory we can always add 1 to any number, so we can never stop counting. A few of these collections are so important that they got their own symbol.
First we have the natural numbers: \mathbb{N} = \{0, 1, 2, 3, ...\}.
If we add negatives to the collection we get all whole numbers, or integers: \mathbb{Z} = \{..., -3, -2, -1, 0, 1, 2, 3, ...\}.
If we also allow fractions, like \frac{1}{2} (a half) or \frac{1}{4} (a quarter) then we have the collection of rational numbers \mathbb{Q}.
Not all numbers can be expressed in fractions, for example pi (\pi, the area of a circle with radius 1) or \sqrt{2}. When we want to include those numbers we get the collection of real numbers \mathbb{R}.

Now suppose we want all positive natural numbers, so 1, 2, 3… (excluding 0). We can indicate this with \mathbb{N}^+. Likewise, all negative integers -1, -2, -3… can be indicated using \mathbb{Z}^-. And of course we can use \mathbb{Q}^+, \mathbb{Q}^-, \mathbb{R}^+ and \mathbb{R}^- to indicate positive or negative rational and real numbers too.

And we can now define new infinite collections using implicit definitions. The collection of all even numbers, for example, can be defined as follows: E = \{ x | x \in \mathbb{Z} \text{ and x is even}\}.
So here E is a collection consisting of all x for which x is an element in \mathbb{Z} (all integers) and x is even.

Subsets

If a collection A contains elements that are all elements of another collection B we say that A is a subset of B. For example A = \{a, b, c\} is a subset of B = \{x | \text{x is a letter in the English alphabet}\} because a, b and c are all letters in the English alphabet.

More formally we can say that if A and B are both collections and for every x \in A it holds that x \in B then A is a subset of B.

We can write this as A \subset B. Since, in the previous example, A does not contain all letters in the English alphabet B we can also say that B is not a subset of A, this is written as B \not\subset A.

Given the above definition we can also say that A \subset A or A is a subset of itself. The empty collection \emptyset is a subset of all collections (including itself).

When a collection A \subset B, but A \neq B then we say that A is a proper subset of B. We may write this as A \subsetneq B.
(Sometimes you may see the notation A \subseteq B to indicate that A is a subset of B that may or may not be equal to B.
I’ll simply use A \subset B throughout the series.)

When A \subset B and B \subset A then A = B.

In a similar manner we can say that B is a superset of A if A is a subset of B. This is notated as B \supset A (it’s the subset-symbol reversed). And of course we can say B \supseteq A to indicate that B is a superset of A that may or may not be equal to A. For a proper superset we may use the B \supsetneq A notation.

And when A \supset B and B \supset A then A = B.
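
The .NET HashSet<T> class has methods for all of these relations: IsSubsetOf, IsProperSubsetOf, IsSupersetOf and IsProperSupersetOf. Here is a minimal sketch that wraps two of them. SubsetDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, only here to illustrate subsets.
public static class SubsetDemo
{
    // A ⊂ B in the notation of this series: every element of A is in B
    // (A may be equal to B).
    public static bool IsSubset<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(a).IsSubsetOf(b);
    }

    // Proper subset: A is a subset of B, but A is not equal to B.
    public static bool IsProperSubset<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        return new HashSet<T>(a).IsProperSubsetOf(b);
    }
}
```

So {a, b, c} is a subset (and a proper subset) of the English alphabet, every set is a subset of itself but never a proper subset of itself, and the empty set is a subset of every set.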

Some code

I promised I’d keep this series somewhat practical. So let’s look at some code. Consider the following C# sample using your favorite .NET collection class.

List<int> list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(1);
int thirdItem = list[2]; // 0-based index.

The list now contains the items 1, 2, 3 and 1 again.  We can also get the item at the nth position, which is only useful when we know the order of the elements within the list. Based on that we can conclude that the List class in .NET is not a set. If we want a set in .NET we can use the HashSet class instead.

HashSet<int> set = new HashSet<int>();
set.Add(1);
set.Add(2);
set.Add(3);
if (!set.Add(1))
{
    Console.WriteLine("Item 1 couldn't be added.");
}
//int secondItem = set[1]; // Doesn't compile.

A HashSet<T> stores its items by their hash code, which means that checking whether an item is in the set takes roughly constant time, whereas a List<T> has to be searched item by item (which gets slower as the list grows). The HashSet<T> can be compared to the Dictionary<TKey, TValue> class, but without values.

Unfortunately C# doesn’t know list comprehensions. Working with infinite collections isn’t very do-able in C# either. Let’s say you’d like to get all positive even numbers, E^+ = \{ x | x \in \mathbb{N}^+ \text{ and x is even}\}. We have a few problems. First \mathbb{N} isn’t available in C#. We could take every number up to Int32.MaxValue (but that threw an OutOfMemoryException). So let’s just take all positive evens smaller than or equal to a million.

IEnumerable<int> evens = from x in Enumerable.Range(1, 1000000)
                         where x % 2 == 0
                         select x;
List<int> evaluatedEvens = evens.ToList();

Something like that is really the closest we can get in C# without going through a lot of trouble.

And at this point I stand corrected. Paulo Zemek pointed out that it’s actually pretty easy to work with infinite collections by utilizing the yield keyword. The next (Console Application) example illustrates this (although int will eventually overflow…).

static void Main(string[] args)
{
    foreach (int i in Evens().Take(10))
    {
        Console.WriteLine(i);
    }
    Console.ReadKey();
}

static IEnumerable<int> Evens()
{
    return XToInfinity(1).Where(i => i % 2 == 0);
}

static IEnumerable<int> XToInfinity(int start)
{
    int current = start;
    while (true)
    {
        yield return current;
        current++;
    }
}

That’s still a lot of typing though…

Let’s take another language that solves these kinds of problems a little better, a language that is a little closer to actual maths, Haskell.

evens = [ x | x <- [1..], x `mod` 2 == 0]

And there you have it. Notice that the code is actually pretty close to the mathematical notation. Take each x where x is an element of [1..] (one to infinity) and where x is even.
We can easily take the first 10 items of this infinite collection without crashing the program or encountering out of memory exceptions.

*Main> take 10 evens
[2,4,6,8,10,12,14,16,18,20]

This isn’t a blog on Haskell, so I can’t really expand on what it does, but I can say that Haskell is a “lazy” language, which means it doesn’t evaluate the items in the list until you actually need them. In this case it will only evaluate the first ten items of evens, keeping it from hogging our memory and CPU.

That’s it for now. We’ve had a very basic introduction to sets, but if you’re not familiar with this stuff it gets hard pretty quickly. So let’s take it nice and easy. Next time we’re going to combine collections and take a look at the Venn diagram.

Hope to see you next time!

How to learn and polyglot vs. specialist

Welcome (back) to my blog. This week’s post is going to be a little different than my other posts. Unfortunately I’m a little short on time this week, too short to write up a small web page using some cool technology and write about it too. However, I still want to deliver you the post you may have been waiting for.
If you read my blog every week, or at least regularly, please drop a line in the comments and let me know, I greatly appreciate it!
I also tweet other articles that you might like as @sanderrossel, so be sure to follow me if you haven’t done so already.

So what do I have in store for you this week? Well I recently got a question from someone who asked me what’s the best way to learn a new technology. I’ll answer that question at the end of this article, but first I’ll tell you why I think you need to learn new technologies from time to time.

Specialized vs. polyglot

One of the reasons I want to write about why you need to learn and how to learn effectively is because lately I’ve been in a few discussions with people about being a specialized programmer versus being a polyglot, or all-round programmer.

A few months ago I switched jobs. One of the reasons for me was that my previous employer wanted everyone to specialize. My specialization would be C# WinForms and Entity Framework, while other colleagues specialized in SQL Server. Sounds nice, right? Except that, next to a C# certificate, I also have a SQL Server certificate that wasn’t going to see any use in that setup… Besides, I had seen quite a bit in C# WinForms and I wanted to move on.
So I applied for another job at another company. Their motto is “we specialize in being all-round.” Put differently, everyone does everything. At one time you may be working with C# and SQL Server, while at another time you may be working with Java and Oracle. Now that’s something I could get into!

Let’s first define these terms, specialized and all-round, or polyglot, and see why you would want to be the one or the other.

I don’t think many people will disagree with me when I say a specialist is someone who has gathered a large amount of knowledge and experience in a specific field. The more specific this field the more specialized you are. Your expertise may be C#, which is a little broad, hardly an expertise actually. C# WinForms with Entity Framework, my specialization, is more specific. You could be as specific as specializing in a single Control, for example the C# WinForms DataGridView. Now that’s a specialization.
I recently read the book Soft Skills by John Sonmez from Simple Programmer. It’s a great book that I can really recommend. John talks at length about why you really need to specialize. In short, if someone has a very specific problem and you specialize in that specific field you’re worth a lot to that someone. The more you specialize the smaller the chances you’ll find someone with the exact problem you can solve, but that also means that the people who have such a specific problem will have trouble finding someone to fix it for them. Therefore, once these people find you (or you find them), and you are one of the few who can really solve their problem, they are going to want you, no matter the cost (to some degree).

So how about all-round, or polyglot? You may have heard about the polyglot programmer and how amazing they are. But what’s a polyglot anyway? The dictionary describes a polyglot as someone who speaks several languages. So a polyglot programmer is someone who knows his way around multiple programming languages. Let’s extend that a bit and say a polyglot programmer also knows multiple paradigms (Object Oriented Programming, Functional Programming, Aspect Oriented Programming…) and probably lots of libraries too, because different languages require different libraries.
The good thing about being a polyglot is that each new language or technology is easier to learn. Is this a managed language? You’ve seen it before. Is it unmanaged? You’ve seen it before. C or BASIC syntax? You’ve seen it all before! So, in theory, you could work for just about anyone. Except that you’re probably not very good at all of these languages, so why would someone hire you over someone who has the exact knowledge they need? But when you think about it, how often do clients know exactly what they want or need?
On the other hand, once systems become really complex and need different languages, databases and paradigms in different parts of the system, well, you’re good to go.

It seems the world is pretty divided on which is best. As said, John Sonmez really urges you to take on a specialization, and it could really work out for you. But if you Google for “polyglot programmer” you’ll find websites, blogs and videos dedicated to that subject as well (both for and against)…

Why not both?

So here’s my view on both specialized and polyglot programming. They’re both great and everyone’s right. I’m usually not that politically correct, so let me explain.

Expertise is great, right? When you’re doing any kind of serious development you’re going to run into issues that can only be properly solved if you really know your stuff. For example, in WinForms you may need to create your own user control. That’s really specialized work; you need to know the design-time implications associated with creating controls. As coincidence would have it, just last week I inherited from a Form with a DataGridView on it. Upon setting some property I populated the DataGridView with columns. What happened at design time? Every time I did a build or opened the WinForms designer it would re-populate the DataGridView, but it didn’t actually throw away the old columns! Before I knew it I had over 100 columns on my Form! So apparently I’m no specialist in WinForms Control development. Luckily I was knowledgeable enough to recognize the problem and fix it right away. How did I get that knowledge? You may not like the answer, but I’ve been building non-trivial WinForms apps for over four years and I’ve been reading books and blogs dedicated to WinForms development. Could I have solved my problem without knowing what I know? Sure, just duplicate a lot of code in every Form I have so I don’t need Form inheritance. Or maybe make a fix for the bug, which then requires another fix, etc. etc… And I’ve seen these ‘solutions’ lots of times!

So if you need four years to learn any language and/or framework to be able to build non-trivial applications with it there’s not much sense in being a polyglot, right? That’s just going to take ages and by the time you’ve mastered your second language your knowledge of the first will be outdated. Well, here’s the trick: you don’t have to be that proficient in every language. I get my daily work done in either Visual Basic.NET or C# with WinForms, and more recently ASP.NET MVC, Entity Framework and SQL Server, and I’m pretty good at all of them. You could say that’s my specialization and it comes kind of naturally as I’m working with them every day from 9 to 5. Once in a while I may need to read up on some stuff at home, and you need to keep challenging yourself, but overall, when you’re working with a technology that much you tend to get a little good at it.
To become truly specialized, an authority in your line of work, you’d need to put a lot of free time in it as well. But what if you spend that free time on learning other technologies? Well now we’re going a little polyglot while also being a little specialized! And here’s the good part, being a polyglot will make you a better specialist!

Wait, what? That sounded pretty contradictory. Let me put it this way. When all you’ve ever known is C# are you truly aware of its strengths and weaknesses?
I’m currently learning Haskell, a functional language, for my university study. When I saw the Haskell implementation of the quick sort algorithm it made me question all I ever knew about C# and object oriented programming. You may be familiar with the C# implementation, or you could maybe guess what it looks like. I can’t post it here, because it’s simply too big, but a quick Google search will give you lots of different implementations, like Iterative Quick Sort (haven’t checked if it’s actually the best out there). Beware that a lot of what you’ll find is recursive, which can give you a StackOverflowException in C# for large lists (especially when a poorly chosen pivot makes the recursion as deep as the list is long).
Now here is the Haskell implementation.

qsort []      = []
qsort (x: xs) = qsort smaller ++ [x] ++ qsort bigger
                   where
                      smaller = [a | a <- xs, a <= x]
                      bigger = [a | a <- xs, a > x]

You don’t actually have to understand it, but the takeaway is that this Haskell implementation is so much more concise, and ultimately readable, than any C# implementation you can come up with.
But there’s more. Are you using .NET LINQ? Well, where did you think that came from? Yes, the functional programming paradigm! Functional languages are also pretty good at multi-threading because values are immutable.
Does that make you want to throw C# out of the window and do everything in Haskell (or, since you’re a .NET dev, F#, or Scala if you’re a Java dev) from now on? No, and neither should you. But you should see what’s out there and whether there’s some language or technology that solves your problem better than the technology you’re (ab)using now. Even if there isn’t, you’ll have gained new insights that might help you in your current job. Just taking a more functional approach in your object oriented language can help you write better and more readable code.
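To make that last point a bit more concrete, here is a rough sketch of what the Haskell qsort could look like when written in a functional style with LINQ. This is my own illustration, not one of the implementations referred to above; the class name FunctionalSort and the method QSort are made up for the example. It is still wordier than the Haskell version, it allocates new lists on every call, and it is recursive, so for real work the built-in OrderBy or Array.Sort are almost certainly the better choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FunctionalSort
{
    // Hypothetical helper mirroring the Haskell qsort: the first element is
    // the pivot, everything else is split into "smaller" and "bigger".
    public static IEnumerable<T> QSort<T>(IEnumerable<T> items) where T : IComparable<T>
    {
        List<T> list = items.ToList();
        if (list.Count == 0)
        {
            return list; // qsort [] = []
        }

        T pivot = list[0];
        List<T> rest = list.Skip(1).ToList();

        // smaller = [a | a <- xs, a <= x]
        IEnumerable<T> smaller = rest.Where(a => a.CompareTo(pivot) <= 0);
        // bigger = [a | a <- xs, a > x]
        IEnumerable<T> bigger = rest.Where(a => a.CompareTo(pivot) > 0);

        // qsort smaller ++ [x] ++ qsort bigger
        return QSort(smaller).Concat(new[] { pivot }).Concat(QSort(bigger));
    }
}
```

Calling FunctionalSort.QSort(new[] { 3, 1, 2 }) yields the sequence 1, 2, 3. No variable is ever modified after it is assigned, which is the part of the functional mindset that carries over to everyday C#.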

How about another example? So as said I do quite a lot of work in SQL Server. I have plenty of experience in developing databases (I even got certified) and some experience in maintaining them (I’m what they call an accidental DBA). SQL Server was just that one tool that could solve all my problems. And then I spent just a few evenings with MongoDB (I wrote a blog about that, A first look at NoSQL and MongoDB in particular). NoSQL just blew my mind. All the troubles I had with SQL Server (and SQL in general), that I took for granted, had been solved by other databases. Those few evenings with NoSQL gave me another perspective on SQL and, I dare say, made me a better SQL programmer because I was now much more aware of SQL’s strong and weak spots.

One last example. I started out as a Visual Basic programmer. Many of you now shudder in disgust. I never understood what all the VB hate was about. Sure, a lot of bad VB has been written, especially pre-.NET VB. But I’ve seen equally bad code in C# as well. And nowadays VB does almost everything C# does and vice versa. It’s just a different syntax. Now this may come as a surprise to some, but not all languages use curly braces! VB is one such language, and because I know VB learning another such language may be easier. But there’s more. In VB there’s a thing called “Option Strict”, which is set to “Off” by default. This makes VB a lot more “forgiving” than C#. For example, the following code would run fine (except for the call to MethodThatDoesNotExist() which gives a runtime exception).

Dim s As String = 42
Dim i As Integer = "42"
Dim o As New Object()
o.MethodThatDoesNotExist()

You may not like it, but this taught me a lot about implicit and explicit casting and late and early binding (which helped me get into JavaScript later). And on the subject of late binding, there are times when it’s pretty handy. In fact, C# introduced the dynamic keyword for it in C# 4.0 (.NET 4.0). Well guess what, as a VB programmer I was way ahead of you! Optional and named parameters, introduced in C# in 2010, came straight from VB. And what about those new cool Exception filters in C# 6.0? VB has had them for years.
Of course the same can be said for C# features that were introduced in VB, like Iterators (yield), but the takeaway is that even these two very closely related siblings take stuff from each other, and knowing one can make you better at, or more prepared for, the other.
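For the C# readers, here is a small sketch of how those borrowed features look on the C# side. The class and method names (VbInspiredFeatures, Greet, CallMissingMethod) are made up for this example, and I’m assuming your project references Microsoft.CSharp, which most project templates do by default.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class VbInspiredFeatures
{
    // Optional parameter with a default value (C# 4.0).
    public static string Greet(string name, string greeting = "Hello")
    {
        return greeting + ", " + name + "!";
    }

    // Late binding with dynamic (C# 4.0) and an exception filter (C# 6.0).
    public static string CallMissingMethod(bool handleBinderErrors)
    {
        dynamic o = new object();
        try
        {
            // Compiles fine, but fails at run time, much like the VB
            // example with Option Strict Off.
            o.MethodThatDoesNotExist();
            return "called";
        }
        catch (RuntimeBinderException) when (handleBinderErrors)
        {
            // The filter decides whether this catch block applies at all.
            return "binder error";
        }
    }
}
```

Named parameters let you call Greet(greeting: "Hi", name: "Sander"), in any order. CallMissingMethod(true) returns "binder error", while CallMissingMethod(false) lets the RuntimeBinderException escape because the filter evaluates to false.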

So why not spend a few hours with some new language once in a while? Just get something up and running in a language you’ve heard about, but don’t know yet, use a framework you haven’t used before, or get a taste of some NoSQL database. Make sure you pick one that’s popular as you’ll be able to find lots of documentation.

How to learn

So I may have convinced you that learning new languages, paradigms or frameworks can be fun and that it can actually help you get an edge as a programmer. But how do you learn a new language or framework? I mentioned Soft Skills before, but I am going to do so again. In this book John Sonmez has a ten step plan for learning just about anything as quickly as possible. I know it works, because John’s approach to learning closely resembles my own method. So here’s what to do.

Read up on the topic you’re interested in. Get a feel for what’s important and what you need at the very least to get started. Also get a feel for how big this thing is you’re learning. Is it a few days of work, or will years of study go into this? Don’t spend too long on this though.
Once you have a picture of what it takes to become proficient at whatever it is you’re trying to learn, determine what parts you want to do. Set some bigger goal you want to reach and some smaller goals that will take you there.
After that find resources, books, blogs, videos, whatever it takes to get you started. And then filter them so you can get started with reaching your first goal.
For most languages and frameworks my first goal is to set up the environment. Once I’ve done that I need some “Hello, world!” kind of application, so I know I can actually write and run code with it. Usually that’s not too difficult with all the pre-reading I’ve done.
After that just play around a bit, get familiar with the syntax, the libraries, the tools, etc. After that you can do whatever you want. I’d repeat those last few steps, keep setting new, small goals, and just get there.
Don’t spend too much time on any tool you won’t be using though. Just get a feel for it and see how you can use certain elements from it in your day to day programming.

 

That’s my approach to learning. John does a much better job at explaining it than I do, so I suggest you buy his book, or order his course 10 Steps to Learn Anything Quickly.
Now John’s 10th step, one I didn’t have, is rather interesting. Teach! Of course I’ve been doing that by writing my articles, but I’ve never thought about it as a part of the learning process. So how can you teach? Well, start a blog! And, as you might’ve guessed, John has a (free) course on that as well. If you’re interested in learning and teaching I suggest you sign up for his three week email course on blogging.

I’ve successfully used this approach (including teaching) while learning web development. Try learning that, “web development”. No way you can pull that off! Web development is such a broad topic you’ll be overwhelmed, discouraged, and, ultimately, you’ll probably fail. Unless you start by learning HTML, then learning CSS, etc. Just take it one step at a time.
And you can see how I did this by reading up on my series on web development. In eight easy lessons you’ll have learned the ‘impossible’ task of “learning web development” and it wasn’t even hard.

I did the same with NoSQL. First I needed to know what NoSQL is. So apparently there’s different ‘flavours’. Then I needed to pick one, so I picked the one that’s closest to SQL (easy starting point!). Then I picked a database, MongoDB was listed as most popular, so that means I can find a lot of stuff about it. Then I needed to know how I can install it, after that run it, then try to insert some data, read it, edit it, read it again, delete, etc. And finally try doing that from C#, first connect, then read, then edit. Step by step. I’ve blogged about the result, A first look at NoSQL and MongoDB in particular.

There you have it. Get started now. Pick up that language you’ve been interested in, but that you never got around to doing. It might benefit you in more ways than you know. Now is as good a time as any. And on that subject, John’s book Soft Skills also talks about how to stop procrastinating and just do what you should be doing (actually the book talks about a lot).
Next week I’ll be back with hopefully a ‘normal’ technical article again. So stay tuned!

Happy coding!

Web development #8: Where to go from here

So if you’ve read all of my previous posts and you’ve made it this far, congratulations! You’ve now learned the basics of web development and you have the knowledge to create awesome websites. To actually create awesome websites you need skills and skills come from practice and experience. I’ve only scratched the surface in this blog series. So it’s now up to you to get your hands dirty and write more HTML, more CSS, more PHP and more JavaScript. And while doing that Google is your friend! I’ve discussed far from all the possibilities (people write entire books about that), but at least you know all the moving parts. So these previous blogs weren’t so much about making you a pro, they were about getting you up to speed in a simple manner. The rest, unfortunately, is up to you. In case you’ve missed some posts, here they all are again.

  1. Web development #1: Internet and the World Wide Web
  2. Web development #2: Our first website using HTML
  3. Web development #3: Styling our page with CSS 3
  4. Web development #4: PHP in the back
  5. Web development #5: User input with HTML Forms
  6. Web development #6: Getting interactive with JavaScript
  7. Web development #7: Dynamic page updates with AJAX
  8. Web development #8: Where to go from here

So what’s left for us? Well, in this post I’ll write about some stuff I haven’t written about in the previous posts, but which every web developer should know about. After that I’ll lay out some alternative technologies for you that may help you get started with the technologies you want.

Debugging

So you’ve written your page, you test it in your browser, and for some reason it doesn’t do what you want it to do. How can we find our error? The examples I’ve given were pretty small and in those cases it may be feasible to just have a good look at your code again. However, when you’re going to write large applications with much more code on both front and back end just looking at your code isn’t going to help you. I haven’t discussed debugging your code because a lot of it depends on your environment. In this series I’ve used Notepad++ which doesn’t have any debugging capabilities (although I read there’s some plugin that lets you debug code, I haven’t tried it though). If you’re going for an IDE (Integrated Development Environment) such as Visual Studio, Eclipse, NetBeans or Xcode you’ll get a lot more possibilities. You can set breakpoints, for example, which allow you to pause your software on a certain line of code, inspect variables and their values, and then even step through your code line by line to follow its flow. Personally I work with Visual Studio and it allows you to see the entire stack and even edit code at run time.

But that’s all back end debugging. What if something is wrong with your CSS, HTML or JavaScript? Luckily all major browsers (and probably the non-major too) have debugging support. If you’re in IE, Firefox or Chrome press F12 and you’ll get the developer tools (alternatively you can look them up in the browser’s menu). Here you should see a tab or button that gets you to the console. In the console you’ll see any errors or warnings that are generated by your page (invalid HTML, a bug in your JavaScript, etc.). You can also log to the console yourself using console.log() in your JavaScript (never ever use that in production code though). There’s also a tab called Network where you can see all server requests from your page. This comes in handy when pages load slowly; perhaps you’re making a lot of requests or you’re loading some large file that takes a while to load. There’s also a tab where you can see your page’s HTML and CSS and edit them real-time (in your browser, not on the server). You can either select an element in the DOM and have it light up on your page or select something on your page and have it highlighted in the DOM. Then you can make changes to your HTML and CSS and see the results real-time. It’s also possible to debug your JavaScript. You can set breakpoints and step through the code following the execution flow and inspecting your variables. Pretty neat and indispensable when working on your pages! Try working with the developer tools in your browser of choice and look for some tutorials.

Picking a back end language

In this series I’ve used PHP. PHP is free (although most languages are nowadays), easy to start with and supported everywhere. You can simply open up Notepad(++), start typing PHP, put it on your server and it’ll run. Compare that to other (compiled) languages like Java and C# and PHP is an absolute winner when it comes to getting started. A lot of popular Content Management Systems (CMS), applications that help in creating, editing, publishing and maintaining content on your websites, such as WordPress, Joomla!, Drupal and Magento, are written in PHP too. So PHP is a good choice for many applications.

However, a lot of people prefer their languages more strongly typed and object oriented. In that case you might go for Java or C# (or Visual Basic). So suppose you want to go for C# because perhaps you already have experience in WinForms or WPF or a client wants a .NET application. So when using C# you’re basically using the .NET Framework and when going for web development you’ll be using the ASP.NET stack. But then in ASP.NET you’ll have some options like WebForms and MVC. Let’s go with ASP.NET MVC, because that’s a good choice for modern web development. ASP.NET MVC makes use of the MVC Design Pattern. MVC stands for Model View Controller. When requesting a page ASP.NET MVC basically calls a method on a class. This class is called the Controller. Perhaps your Controller makes some database calls and does some computations and then comes up with the data that you want to show on your page. This data is just another class and represents the Model. The Model is then passed to your View, which is basically your HTML, which is then returned to the client. And, just like you mix PHP with HTML, you can mix C# (or Visual Basic) with the HTML in your View using the Razor view engine.
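The Controller, Model and View flow described above can be sketched in plain C#. Note that this is not the ASP.NET MVC API: there is no routing, no Razor and no base classes here, and the names PersonModel, PersonController and PersonView are made up for the illustration. It only shows who hands what to whom.

```csharp
using System;
using System.Net;

// The Model: just a class holding the data for the page.
public class PersonModel
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// The Controller: a request ends up as a method call on this class.
public class PersonController
{
    public PersonModel Details(int id)
    {
        // A real controller would query a database using the id here;
        // this sketch returns hard-coded sample data instead.
        return new PersonModel { FirstName = "Jane", LastName = "Doe" };
    }
}

// The View: turns the Model into the HTML that goes back to the client.
public static class PersonView
{
    public static string Render(PersonModel model)
    {
        string fullName = model.FirstName + " " + model.LastName;
        // Encode the data so it cannot inject markup into the page.
        return "<h1>" + WebUtility.HtmlEncode(fullName) + "</h1>";
    }
}
```

A request for a person’s page then boils down to PersonView.Render(new PersonController().Details(1)), which produces the string <h1>Jane Doe</h1>. In real ASP.NET MVC the framework does this wiring for you based on the URL.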
So you want to get started with ASP.NET MVC? I don’t blame you, it’s a great product. I recommend you get the Visual Studio Community Edition for free and just start! There’s plenty of tutorials on the web, but if you’re looking for a more structured course on MVC I can recommend the book Professional ASP.NET MVC 5.
And here’s a little downside to .NET compared to PHP. Once you have your software ready for production you’ll need a server with .NET installed that’s also running some special server software called IIS (Internet Information Services).

Another alternative to PHP and C# is Node.js. Node.js is relatively new and is a fast and lightweight platform that allows you to run JavaScript in your back end and create real-time web applications. So you can use JavaScript in your back end, which is pretty neat because that means you can re-use code from your back end in your front end! Try doing that using any other back end language. Other than that, Node.js works well with (web) sockets, which enable it to send data to your client without that client asking for it. Usually a client sends a request and the server serves. But now we can serve without the client having to request! That allows us to easily create, for example, a chat application. Our client simply subscribes to the chat and our server sends every new message to the client as soon as it’s received. So when going with Node.js you probably want to use Express as well. Express is a JavaScript framework for Node.js which makes working with Node.js just a bit easier. And when you want to start using sockets extensively you might want to check out Socket.IO, which is a library for working with sockets from JavaScript. And of course you’ll need to generate your HTML in Node.js. There are a few options for you, but Jade is a pretty popular one.
So you may have figured out some of the downsides of Node.js. First of all, it’s JavaScript, which may or may not be a problem for you, depending on your tastes. Second, unlike C#, Node.js doesn’t “just work”. To get any serious business done you need quite a few external JavaScript libraries (and there’s A LOT as we’ll see in a bit). The pro, of course, is fast, relatively easy, real-time web apps using the same language as your front end.
If you’re interested in Node.js you may take a look at what’s called the MEAN stack, MongoDB, Express, AngularJS and Node.js. It’s free, open-source and very JavaScript.

I should probably mention that .NET has its own sockets framework that allows you to build real-time web apps easily, called SignalR.

So we’ve looked at some alternatives to PHP. There are more, like Ruby (on Rails), Python and Java. You can check them out, read a bit about them, and decide what works well for you.

Some front end alternatives

So we’ve looked at some back end alternatives, but what can you do on the front end? On the front end it really all comes down to HTML, CSS and JavaScript. Your HTML is generated on your back end and there’s plenty of options to do it, like Razor, Jade or any other HTML generator. It all depends on what back end you pick.

So what about CSS? Well, browsers really require CSS to lay out your page. There are some alternatives though, most notably LESS.
LESS looks a lot like CSS, but adds some features. You could almost call it object oriented CSS.
Another alternative is Stylus, which, like LESS, adds features to CSS. Stylus focuses more on expressive CSS that is easy and clean to read, while also supporting some form of object orientation.
There’s more, like Sass and Turbine.
Now here’s the deal. None of them replace CSS, rather they are compiled to CSS. So you write your LESS (or any other), compile it, then use the generated CSS on your page. This adds some hassle, since you need to compile your code (your not-quite-CSS) before you can actually see it on your page (as opposed to just refreshing your page). But they also make for clean, maintainable CSS. I recommend checking out at least one of them, especially when you’re going to build larger websites. Alternatively you can just use an already existing library, such as Twitter Bootstrap, which I’ll talk about in a bit.

What about alternatives for JavaScript? There are quite a few languages that compile to JavaScript. The two most notable are probably TypeScript and CoffeeScript.
When you read the first lines on the CoffeeScript page you’ll pretty much have an idea what CoffeeScript is all about, “JavaScript has always had a gorgeous heart. CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way”. So there isn’t much to say about that. It’s just a new syntax for JavaScript, hiding the dirty bits. I haven’t used it myself, but if you don’t quite like the JavaScript syntax and want to try something that’s like JavaScript, but prettier you might want to check out CoffeeScript.
Now TypeScript, that’s quite something different. It adds type safety to JavaScript and actually reads more like C# than JavaScript. Not completely by coincidence as TypeScript was actually created by Anders Hejlsberg, lead architect of C#. Of course it still just compiles to JavaScript. If you’re already in the Microsoft stack and using Visual Studio you may as well give TypeScript a try!
I also want to mention Dart very briefly. It was created by Google and it’s a fully object oriented language that can be compiled to JavaScript. In their words “new, yet familiar”.

Libraries and Frameworks

When working with JavaScript you know everything is possible, but nothing is easy. Luckily a whole lot of people have created libraries for you that you can use when using JavaScript. In this section I just want to point out some popular libraries and frameworks. We’ve already seen jQuery and jQuery UI.
Another very popular framework is Twitter Bootstrap. It’s mostly CSS, but has some JavaScript too. It allows you to create pages that look good and scale well across devices with relative ease. It mostly depends on tagging your HTML elements correctly. I’m not going to discuss it any further here. Just know that it’s available and that it’s widely used.
Another popular library is Knockout. With Knockout you can bind your JavaScript classes to your HTML elements. So values are automatically synchronized between your HTML and JavaScript. If a user changes the value of a text field the underlying value is changed and is reflected on your page and vice versa. Again, I’m not discussing it further, just know that it exists.
Another library that you simply cannot ignore is AngularJS. AngularJS is an MVVM (Model View ViewModel) framework for building Single Page Applications (SPA). That means you get a single web page and all data is fetched using AJAX. It makes for a fluent user experience as the website doesn’t need to refresh with each request (only parts of it). AngularJS is BIG. It basically does most that jQuery does and everything that Knockout does as well and the learning curve can be steep. Luckily there are some nice tutorials and books around.
Now one of the most awesome JavaScript libraries you’ll come across is D3.js. If you need any kind of graph or table, or any visual to represent your data, D3.js is the library you need. Just look at it. The website features many examples and it’s fun to just look at it. The only thing I don’t like about this library is that I haven’t needed it yet 🙂
You might come across Ember.js as well. It’s an MVC framework for creating Single Page Applications.
Without going into detail, here are some other popular JavaScript libraries: Underscore.js, Backbone.js, MooTools, jQuery Mobile and Modernizr.

There’s literally thousands of JavaScript libraries and frameworks. Some are pretty big, like Angular, and some are really small and do just one thing only, but do it really well. You might want to check out Microjs, a website with literally hundreds of very small JavaScript files. Just look around and see what’s available, it might surprise you.

Some final words

So in this final post of my web development series we’ve looked at some alternatives and libraries you can use to help you create awesome websites. There’s still lots of stuff that we haven’t covered, like putting your website in production (because that really depends on the languages you used and where you’re hosting), security (very important!) and SEO, or Search Engine Optimization. We’ve also skipped databases entirely.
We did have a look at all the parts that are vital in web development though. You should now have a pretty good idea of what you need to create your own websites.
In this series I have pointed out some books from the SyncFusion Succinctly Series, and I’m going to do so again. You can subscribe freely and gain access to free books on JavaScript, Twitter Bootstrap, Knockout, AngularJS, Node.js and much more. All I can say is that it’s really worth it!
For more (less succinct) books on various topics, including a lot of web development, I can recommend Manning Publications. They have some good stuff on Node.js, D3.js, SPA Design, CORS, and more.
Two other really cool articles/projects I came across are Learn JavaScript by Dave Kerr, where he creates a Space Invaders game using JavaScript and a Mario game by Florian Rappl. Both are worth checking out (and if you don’t like their articles you can still play the games 😉 ).

So that’s it for this series. That isn’t to say I’ll stop blogging or I’ll stop writing about web development, it just won’t be for this series. I hope you enjoyed it as much as I did. Any comments and questions are more than welcome!

Thanks for reading.

Happy coding!

Using DataSets to work with (relational) in-memory data

After a previous blog, Using C# to connect to and query from a SQL database, I got a request to write a bit more about the SqlDataAdapter and DataSets. So here it goes.

Looking at the DataSet

So what exactly is a DataSet? Besides it obviously being a class in the .NET Framework the DataSet represents an in-memory cache of data. This data can be stored in multiple DataTables. A DataTable has DataColumns, which define its schema, and DataRows, which hold a value for each of those columns. DataTables in a single DataSet can be related to each other by what can best be described as foreign key relations; the DataRelation is used to establish such a relation. Additionally DataTables can contain constraints such as the UniqueConstraint and the ForeignKeyConstraint. In case you hadn’t noticed yet, this is the relational model we know from SQL databases!
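To see those building blocks together without needing a database, here is a small in-memory sketch. The table and column names follow the AdventureWorks tables used in this post, but the class InMemoryDataSetDemo, the relation and constraint names, and the sample row are made up for the illustration.

```csharp
using System;
using System.Data;

public static class InMemoryDataSetDemo
{
    public static DataSet Build()
    {
        DataSet set = new DataSet("Demo");

        // Parent table with a unique ID column.
        DataTable entity = set.Tables.Add("BusinessEntity");
        DataColumn entityId = entity.Columns.Add("ID", typeof(int));
        entity.Constraints.Add(new UniqueConstraint("UQ_BusinessEntity_ID", entityId));

        // Child table that refers to the parent through its ID column.
        DataTable person = set.Tables.Add("Person");
        DataColumn personId = person.Columns.Add("ID", typeof(int));
        person.Columns.Add("FirstName", typeof(string));
        person.Columns.Add("LastName", typeof(string));

        // The DataRelation links parent and child. By default it also
        // adds a ForeignKeyConstraint to the child table.
        set.Relations.Add("BusinessEntity_Person", entityId, personId);

        // Sample data: the parent row has to exist before the child row.
        entity.Rows.Add(1);
        person.Rows.Add(1, "Jane", "Doe");

        return set;
    }
}
```

With the relation in place you can navigate from a parent row to its children using set.Tables["BusinessEntity"].Rows[0].GetChildRows("BusinessEntity_Person"). Adding a Person row with an ID that has no BusinessEntity row throws an InvalidConstraintException, just like a foreign key violation would in SQL Server.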
So yes, you’d think that working with DataSets and SQL kind of go hand-in-hand and it is in fact quite easy to fill a DataSet using the SqlDataAdapter. We’ve seen this in the previous blog post, but let’s quickly take a look at it again.

DataTable table = new DataTable();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(table);
}

Notice that I’ve used a DataTable instead of a DataSet. In this case all I’m getting from the database is a single result set. If I had used a DataSet the DataAdapter would’ve filled it with one DataTable, the one we have now. So in the next example I’m going to use a DataSet and do something with the data we get.

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
}

foreach (DataTable table in set.Tables)
{
    foreach (DataRow row in table.Rows)
    {
        Console.WriteLine("ID: {0}, Name: {1} {2}.", row["ID"], row["FirstName"], row["LastName"]);
    }
}

Console.ReadKey();

Notice that I loop through the tables of the DataSet and through the rows of each DataTable. The expression row["ColumnName"] accesses the value (boxed in an object) of the column in that row. Obviously, if I had two tables, the second not being Person, this code would break because that table wouldn’t have the FirstName and LastName columns.
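
The boxing is easy to see without a database. The following is a minimal sketch, assuming an in-memory table with two rows I made up for illustration (the class and method names are hypothetical too). It shows that the indexer hands back an object you have to cast, that a NULL shows up as DBNull, and that the Field<T> extension method (from System.Data.DataSetExtensions) does the cast and the NULL handling for you.

```csharp
using System;
using System.Data;

public static class RowAccessDemo
{
    // Builds a small in-memory Person table; the rows are made up for illustration.
    public static DataTable BuildPersonTable()
    {
        DataTable table = new DataTable("Person");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Columns.Add("MiddleName", typeof(string));
        table.Columns.Add("LastName", typeof(string));

        table.Rows.Add(1, "Ken", "J", "Smith");
        table.Rows.Add(2, "Terri", DBNull.Value, "Jones"); // No middle name.
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildPersonTable();
        foreach (DataRow row in table.Rows)
        {
            // The indexer returns object, so a cast is needed to get the int back.
            object boxedId = row["ID"];
            int id = (int)boxedId;

            // A NULL in the data is stored as DBNull.Value, not as null.
            bool hasMiddleName = !row.IsNull("MiddleName");

            // Field<T> casts for you and turns DBNull into null.
            string middleName = row.Field<string>("MiddleName");

            Console.WriteLine("ID: {0}, has middle name: {1}, middle name: {2}",
                id, hasMiddleName, middleName ?? "(none)");
        }
    }
}
```

Field<T> doesn’t make the table strongly typed, a wrong column name or type still only fails at runtime, but it does save you the casts and the DBNull checks.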

Accessing multiple tables and schemas

So in the next example we’re going to load multiple tables into our DataSet. By default, the DataSet creates tables with the name “Table”, “Table1”, “Table2”, etc. If we want to access our tables by name rather than index, like we do with our columns, we’ll have to name them ourselves. You’ll see this in the next example.

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
}

foreach (DataTable table in set.Tables)
{
    if (table.TableName == "Person")
    {
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(String.Format("ID: {0}, Name: {1} {2}.", row["ID"], row["FirstName"], row["LastName"]));
        }
    }
    else if (table.TableName == "BusinessEntity")
    { //... 
    }
}

Console.ReadKey();

And even better would be to not loop through tables at all, but simply access the table you want by name.

DataTable personTable = set.Tables["Person"];
foreach (DataRow row in personTable.Rows)
{ //...
}

As you can see we’re querying for two tables, the first having only the ID column and the second being our Person table. It’s also possible to create the tables in your DataSet without querying for data. This can be done by using the FillSchema method of the DataAdapter. The following will result in the same tables we had in the previous example, but without data.

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.FillSchema(set, SchemaType.Source);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
}

Of course the SqlDataAdapter still has to do a roundtrip to the database to get the schema information. It will send the following query to get the schema (I’m not sure if it’s the same for all versions of .NET and/or SQL Server).

SET FMTONLY OFF; SET NO_BROWSETABLE ON; SET FMTONLY ON;SELECT BusinessEntityID AS ID FROM Person.BusinessEntity
SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person
SELECT BusinessEntityID AS ID, AddressID, AddressTypeID FROM Person.BusinessEntityAddress
SELECT AddressID, AddressLine1, AddressLine2, City, PostalCode FROM Person.Address SET FMTONLY OFF; SET NO_BROWSETABLE OFF;

Getting relational

I mentioned that DataSets can also have relations between tables. Unfortunately you’ll have to add those yourself. Why would you want to do this? Because you can now navigate from a row in one table to child rows in other tables. BusinessEntity and Person are related, one BusinessEntity represents one Person. However, a BusinessEntity can have one or more addresses through the BusinessEntityAddress table, which has a relation to Address. Let’s load all those tables into our DataSet, create the appropriate relations and display the names and addresses of our business entities (note that I’m only printing the IDs and names of people who have at least one address). Remember that in order to create a relation all rows from a child table need a row in their master table (just like in the database). This is especially tricky when you start using filters. For example, getting ONLY Persons with FirstName like ‘A%’, but getting ALL addresses will result in addresses without a person (or a child without a master), which prevents you from creating a relation.

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person\r\n" +
    "SELECT BusinessEntityID AS ID, AddressID, AddressTypeID FROM Person.BusinessEntityAddress\r\n" +
    "SELECT AddressID, AddressLine1, AddressLine2, City, PostalCode FROM Person.Address", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Tables[2].TableName = "BusinessEntityAddress";
    set.Tables[3].TableName = "Address";

    set.Relations.Add("BusinessEntity_Person", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
    set.Relations.Add("BusinessEntity_BusinessEntityAddress", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["BusinessEntityAddress"].Columns["ID"]);
    set.Relations.Add("BusinessEntityAddress_Address", set.Tables["BusinessEntityAddress"].Columns["AddressID"], set.Tables["Address"].Columns["AddressID"]);
}

foreach (DataRow entityRow in set.Tables["BusinessEntity"].Rows)
{
    foreach (DataRow personRow in entityRow.GetChildRows("BusinessEntity_Person"))
    {
        foreach (DataRow entityAddressRow in entityRow.GetChildRows("BusinessEntity_BusinessEntityAddress"))
        {
            foreach (DataRow addressRow in entityAddressRow.GetChildRows("BusinessEntityAddress_Address"))
            {
                Console.WriteLine("The entity with ID {0} has name {1} {2} and address {3} {4} {5}.",
                    entityRow["ID"], personRow["FirstName"], personRow["LastName"],
                    addressRow["AddressLine1"], addressRow["City"], addressRow["PostalCode"]);
            }
        }
    }
}

Console.ReadKey();

Now that’s quite a bit of code… I’m using a lot of (magical) strings to illustrate how it works. I suggest you create some constants or something. Don’t duplicate strings like that.
So take a good look at that example and let it sink in. You’ve just loaded your database, including relations, in code! That’s pretty neat! But it gets even better!
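
The remark about child rows needing a master row can be tried out without a database. Here’s a minimal sketch, assuming two in-memory tables with made-up rows; the class, method, table and column names are mine and purely illustrative. It uses constants instead of magic strings, shows that adding a relation fails with an ArgumentException when a child row has no parent, and shows the overload that takes createConstraints = false, which skips that check.

```csharp
using System;
using System.Data;

public static class RelationDemo
{
    // Constants instead of magic strings; the names are illustrative.
    public const string PersonTable = "Person";
    public const string AddressTable = "Address";
    public const string PersonAddressRelation = "Person_Address";

    // Builds one person and one address; optionally adds an address
    // that points at a person that was not loaded (an orphan).
    public static DataSet BuildSet(bool includeOrphan)
    {
        DataSet set = new DataSet();

        DataTable persons = set.Tables.Add(PersonTable);
        persons.Columns.Add("ID", typeof(int));
        persons.Columns.Add("FirstName", typeof(string));
        persons.Rows.Add(1, "Alice");

        DataTable addresses = set.Tables.Add(AddressTable);
        addresses.Columns.Add("AddressID", typeof(int));
        addresses.Columns.Add("PersonID", typeof(int));
        addresses.Columns.Add("City", typeof(string));
        addresses.Rows.Add(10, 1, "Seattle");
        if (includeOrphan)
        {
            addresses.Rows.Add(11, 2, "Portland"); // Person 2 is not in the set.
        }
        return set;
    }

    // Returns true if the relation, including its constraints, could be created.
    public static bool TryAddRelation(DataSet set)
    {
        try
        {
            set.Relations.Add(PersonAddressRelation,
                set.Tables[PersonTable].Columns["ID"],
                set.Tables[AddressTable].Columns["PersonID"]);
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown when not every child row has a parent row.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine("With orphan: {0}", TryAddRelation(BuildSet(true)));

        DataSet clean = BuildSet(false);
        Console.WriteLine("Without orphan: {0}", TryAddRelation(clean));
        foreach (DataRow person in clean.Tables[PersonTable].Rows)
        {
            foreach (DataRow address in person.GetChildRows(PersonAddressRelation))
            {
                Console.WriteLine("{0} lives in {1}.", person["FirstName"], address["City"]);
            }
        }

        // Passing false for createConstraints skips the check, orphans and all.
        DataSet lenient = BuildSet(true);
        lenient.Relations.Add(PersonAddressRelation,
            lenient.Tables[PersonTable].Columns["ID"],
            lenient.Tables[AddressTable].Columns["PersonID"],
            false);
        Console.WriteLine("Relations without constraints: {0}", lenient.Relations.Count);
    }
}
```

Skipping the constraints still lets you navigate with GetChildRows, but you lose the guarantee that every child has a parent, so I’d only do that for read-only data.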

CRUD operations using DataSets

So we’ve looked at reading data, but what about creating, updating and deleting data? To do this you need to set the Insert, Update and DeleteCommands of your DataAdapter and call the Update method. You can do this manually, simply create three SqlCommands, one with an INSERT statement, one with an UPDATE statement and one with a DELETE statement. You can do with less if you know that certain actions are impossible, for example records from a certain table could never be deleted.
You can also use the SqlCommandBuilder, like I will do in the next example. The SqlCommandBuilder can create the insert, update and delete commands for you based on your (single table!) select query. So make sure you use the same select query in both your actual select and your update routine! The SqlCommandBuilder needs to get the schema of your table and does this using your select query. That also means it needs to make a roundtrip to the database (the same we saw with FillSchema). So when you really need that performance, or if you want to do things ‘right’, set your commands manually.
Another caveat when updating records is that you can only update one table at a time and it will do so row by row. If you update a DataSet and don’t specify a table the Update method will use the first table in the DataSet.
Before updating you can call HasChanges (to prevent going to the database if you have no changes) and GetChanges, which returns a new DataSet containing only changed DataTables and DataRows. When you do you’ll need to explicitly call AcceptChanges on your original DataSet to indicate the changes were successfully saved to the data source. Alternatively you can call RejectChanges to restore the DataSet to its original state (since the last Accept- or RejectChanges was called or since it was created). You can also accept or reject changes on DataTable and even DataRow level.
That’s A LOT of information, let’s look at some code!

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person ORDER BY ID", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(set);
}

set.Tables[0].Rows[0]["FirstName"] = "Sander";

if (set.HasChanges())
{
    DataSet changes = set.GetChanges();
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person ORDER BY ID", connection))
    using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
    using (SqlCommandBuilder builder = new SqlCommandBuilder(adapter))
    {
        adapter.Update(changes);
        set.AcceptChanges();
    }
}
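
If you want to see what HasChanges, GetChanges, AcceptChanges and RejectChanges do without touching a database, here’s a minimal sketch. It assumes an in-memory table with one made-up row; the class and method names are hypothetical. Calling AcceptChanges right after adding the row puts it in the Unchanged state, which is the state rows are in after a default Fill.

```csharp
using System;
using System.Data;

public static class ChangeTrackingDemo
{
    // Builds a DataSet with a single made-up Person row and marks it as unchanged.
    public static DataSet BuildSet()
    {
        DataSet set = new DataSet();
        DataTable table = set.Tables.Add("Person");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Rows.Add(1, "Ken");
        set.AcceptChanges();
        return set;
    }

    public static void Main()
    {
        DataSet set = BuildSet();
        DataRow row = set.Tables["Person"].Rows[0];
        Console.WriteLine("{0}, HasChanges: {1}", row.RowState, set.HasChanges());

        row["FirstName"] = "Sander";
        Console.WriteLine("{0}, HasChanges: {1}", row.RowState, set.HasChanges());

        // Both versions of the value are kept until changes are accepted or rejected.
        Console.WriteLine("Original: {0}, current: {1}",
            row["FirstName", DataRowVersion.Original],
            row["FirstName", DataRowVersion.Current]);

        // GetChanges returns a copy that contains only the changed rows.
        DataSet changes = set.GetChanges();
        Console.WriteLine("Changed rows: {0}", changes.Tables["Person"].Rows.Count);

        // RejectChanges rolls the row back to its original value.
        set.RejectChanges();
        Console.WriteLine("{0}, FirstName: {1}", row.RowState, row["FirstName"]);
    }
}
```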

You can insert and delete rows in the same manner. Let’s look at inserting a record. In this case we always need to insert a BusinessEntity or we’ll create a Person without a BusinessEntity. This is actually more complicated than it sounds. First, we’ll need all the columns that do not allow NULL values and set their values. Second, we need to let our DataSet know that there’s a relation between BusinessEntity and Person. We’ll also need to let the DataSet know that the BusinessEntityID is an ID field with an Identity Specification (which means we don’t have to set it ourselves), but our Person needs that same ID (we can use FillSchema for this!). Last, we need to issue an update for each table separately, which means creating separate SqlCommands, SqlDataAdapters and SqlCommandBuilders…

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.TableMappings.Add("Person.BusinessEntity", "BusinessEntity");
    adapter.TableMappings.Add("Person.Person", "Person");
    adapter.FillSchema(set, SchemaType.Source);
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Relations.Add("Relation", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
}

DataTable businessEntityTable = set.Tables["BusinessEntity"];
DataRow newBusinessEntityRow = businessEntityTable.NewRow();
businessEntityTable.Rows.Add(newBusinessEntityRow);

DataTable personTable = set.Tables["Person"];
DataRow newPersonRow = personTable.NewRow();
newPersonRow["PersonType"] = "EM";
newPersonRow["NameStyle"] = 0;
newPersonRow["FirstName"] = "Sander";
newPersonRow["LastName"] = "Rossel";
newPersonRow["EmailPromotion"] = 0;
newPersonRow.SetParentRow(newBusinessEntityRow);
personTable.Rows.Add(newPersonRow);

if (set.HasChanges())
{
    DataSet changes = set.GetChanges();
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand businessEntityCmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity", connection))
    using (SqlDataAdapter businessEntityAdapter = new SqlDataAdapter(businessEntityCmd))
    using (SqlCommandBuilder businessEntityBuilder = new SqlCommandBuilder(businessEntityAdapter))
    using (SqlCommand personCmd = new SqlCommand(
        "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
    using (SqlDataAdapter personAdapter = new SqlDataAdapter(personCmd))
    using (SqlCommandBuilder personBuilder = new SqlCommandBuilder(personAdapter))
    {
        businessEntityAdapter.Update(changes, "BusinessEntity");
        personAdapter.Update(changes, "Person");
        set.AcceptChanges();
    }
}

We can also delete rows. In this database it’s a bit difficult because everything is related and some entities simply cannot be deleted (only made inactive). But we can, of course, delete the row we just inserted. Most of the code above remains the same, except the part where we created the DataRows. I’ll use some LINQ to retrieve the last inserted BusinessEntity (assuming it’s the row we just inserted). Once I have the row I’ll first delete the child rows (Persons) and then I’ll delete the BusinessEntity.
Now here comes the tricky part: in the example above we first needed to insert the BusinessEntity to generate an ID so we could insert the Person. For deletion it’s the other way around! That means that if you would do updates, inserts and deletes all in one go you’d get an exception. Either your inserts fail because you insert child rows before their parent rows are inserted, or your deletes fail because you delete parent rows before their child rows are deleted.
You can use an overload on GetChanges that accepts a DataRowState to create two DataSets, one with updates and inserts and one with deletes and execute them in the correct order.
In the next example I’ll use the DataRowState, but I’m only deleting rows.

DataSet set = new DataSet();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity\r\n" +
    "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.TableMappings.Add("Person.BusinessEntity", "BusinessEntity");
    adapter.TableMappings.Add("Person.Person", "Person");
    adapter.FillSchema(set, SchemaType.Source);
    adapter.Fill(set);
    set.Tables[0].TableName = "BusinessEntity";
    set.Tables[1].TableName = "Person";
    set.Relations.Add("Relation", set.Tables["BusinessEntity"].Columns["ID"], set.Tables["Person"].Columns["ID"]);
}

DataTable table = set.Tables["BusinessEntity"];
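// Cast<DataRow>() needs System.Linq, Field<T>() needs System.Data.DataSetExtensions.
List<DataRow> rows = table.Rows.Cast<DataRow>().ToList();
int maxId = table.Rows.Cast<DataRow>().Max(b => b.Field<int>("ID"));
DataRow row = rows.Single(be => be.Field<int>("ID") == maxId);
// Delete the child rows (Persons) first, then the parent row.
foreach (DataRow personRow in row.GetChildRows("Relation"))
{
    personRow.Delete();
}
row.Delete();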

if (set.HasChanges())
{
    DataSet changes = set.GetChanges(DataRowState.Deleted);
    using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
    using (SqlCommand businessEntityCmd = new SqlCommand("SELECT BusinessEntityID AS ID FROM Person.BusinessEntity", connection))
    using (SqlDataAdapter businessEntityAdapter = new SqlDataAdapter(businessEntityCmd))
    using (SqlCommandBuilder businessEntityBuilder = new SqlCommandBuilder(businessEntityAdapter))
    using (SqlCommand personCmd = new SqlCommand(
        "SELECT BusinessEntityID AS ID, PersonType, NameStyle, FirstName, MiddleName, LastName, EmailPromotion FROM Person.Person", connection))
    using (SqlDataAdapter personAdapter = new SqlDataAdapter(personCmd))
    using (SqlCommandBuilder personBuilder = new SqlCommandBuilder(personAdapter))
    {
        personAdapter.Update(changes, "Person");
        businessEntityAdapter.Update(changes, "BusinessEntity");
        set.AcceptChanges();
    }
}
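
Splitting the changes into two DataSets, one with inserts and updates and one with deletes, can be sketched without a database as well. This is a minimal example, assuming an in-memory table with made-up rows and hypothetical class and method names; the actual calls to the DataAdapters are left out.

```csharp
using System;
using System.Data;

public static class SplitChangesDemo
{
    // Builds a DataSet with three made-up, unchanged rows and then
    // modifies one, deletes one and adds one.
    public static DataSet BuildSetWithMixedChanges()
    {
        DataSet set = new DataSet();
        DataTable table = set.Tables.Add("Person");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Rows.Add(1, "Ken");
        table.Rows.Add(2, "Terri");
        table.Rows.Add(3, "Roberto");
        set.AcceptChanges();

        table.Rows[0]["FirstName"] = "Sander"; // Modified
        table.Rows[1].Delete();                // Deleted
        table.Rows.Add(4, "Rob");              // Added
        return set;
    }

    public static void Main()
    {
        DataSet set = BuildSetWithMixedChanges();

        // DataRowState is a flags enum, so states can be combined.
        DataSet insertsAndUpdates = set.GetChanges(DataRowState.Added | DataRowState.Modified);
        DataSet deletes = set.GetChanges(DataRowState.Deleted);

        Console.WriteLine("Rows to insert or update: {0}",
            insertsAndUpdates.Tables["Person"].Rows.Count);
        Console.WriteLine("Rows to delete: {0}", deletes.Tables["Person"].Rows.Count);

        // A deleted row only exposes its original values.
        DataRow deleted = deletes.Tables["Person"].Rows[0];
        Console.WriteLine("Deleted ID: {0}", deleted["ID", DataRowVersion.Original]);

        // With real adapters you would now update parents before children for
        // insertsAndUpdates, and children before parents for deletes.
    }
}
```

Keep in mind that GetChanges returns null when there are no rows in the requested state, so check for that before passing the result to Update.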

And there you have it! No one said working with DataSets was easy… In this post we looked at DataSets and SqlDataAdapters, but it is possible to use DataSets with other data sources, such as XML, too! In fact, DataSets and DataTables are unaware of their data source, you can fill them however you like. Their structure also makes them perfect for binding to DataGridViews in WinForms (and probably WPF too, but I wouldn’t really know).
Their weakness is that they are not strongly typed. We are forced to work with strings to refer to table and column names, and all values contained in them are objects (which means they could be anything).
Microsoft saw this and created so-called Typed DataSets. I recommend you skip Typed DataSets and move straight to LINQ To SQL or the Entity Framework. That’s a whole different topic though.

Comments are welcome. Happy coding!

A first look at NoSQL and MongoDB in particular

So today I decided to have a look at NoSQL. It’s not exactly new and actually I’m a bit late to jump on the NoSQL train, but so far I had no need for it (and actually I still don’t, but I had some time to spare and a blog to write). Since NoSQL can be quite complicated, as it imposes a new way of thinking about storing data, and I can’t possibly discuss everything there is to discuss, I’ll add some additional reading at the end of the article.

An overview of NoSQL

First things first, what is NoSQL? As the name implies it’s not SQL (Structured Query Language), the standard language for databases that support the relational model. As SQL has been the standard for about twenty to thirty years I’m not going to discuss it, you probably know it. A common misunderstanding with NoSQL is that it stands for “no SQL”, while it actually means “Not Only SQL”, which implies there is at least some SQL-y goodness to be had in NoSQL as well. Whatever that SQL-y goodness may be, it’s not the relational model. And this is where NoSQL is fundamentally different from SQL: expect de-normalized and duplicated data. This ‘feature’ makes it possible to keep schemas flexible though. In NoSQL it’s generally easy to add fields to your database. Where in a SQL database you could lock a table for minutes when adding a column to a table that contains a lot of data, in NoSQL you can add fields on the fly (during production!). Querying data can also be faster than in your typical SQL database, because the de-normalization reduces or even eliminates expensive joins. A downside to this method of storing data is that it is harder to keep your data consistent. Where in SQL consistency is more or less guaranteed if you have normalized your database, NoSQL offers consistency or eventual consistency. How NoSQL databases provide this (eventual) consistency differs per vendor, but it doesn’t come as naturally as in SQL databases. Also, because of the way data is stored and queried, NoSQL databases tend to scale better across machines than SQL databases.
Other than that no uniform definition can be given for NoSQL because there is no standard. Still NoSQL can be roughly divided into four database models (some would say more, let’s not get into such details): Document, Graph, Key-value and Wide Column. So let’s get a quick overview of those and try one out!

The Document Model

First there’s the Document model. When thinking of a document don’t think of a Word or Excel document, think of an object like you would have in an object-oriented language such as Java or C#. Each document has fields containing a value such as a string, a date, another document or an array of values. The schema of a document is dynamic and as such it’s a breeze to add new fields. Documents can be queried on any field.
Because a value can be another document or array of documents data access is simplified and it reduces or even eliminates the use for joins, like you would need in a relational database. It also means you will need to de-normalize and store redundant data though!
Document model databases can be used in a variety of applications. The model is flexible and documents have rich query capabilities. Additionally the document structure closely resembles objects in modern programming languages.
Some examples of Document databases are MongoDB and CouchDB.
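
To make the ‘a document is like an object’ idea a bit more concrete, here’s a minimal sketch in plain C#. It doesn’t use a document database at all; the classes, names and data are made up for illustration. Each person embeds its addresses, so finding persons by city needs no join, and the Utrecht address is stored twice, which is the redundancy you get in return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A 'document': a plain object that embeds other objects and lists.
public class AddressDocument
{
    public string City { get; set; }
    public string PostalCode { get; set; }
}

public class PersonDocument
{
    public string Name { get; set; }
    public List<AddressDocument> Addresses { get; set; }
}

public static class DocumentModelDemo
{
    // Builds two made-up person documents with their addresses embedded.
    public static List<PersonDocument> BuildDocuments()
    {
        return new List<PersonDocument>
        {
            new PersonDocument
            {
                Name = "Sander",
                Addresses = new List<AddressDocument>
                {
                    new AddressDocument { City = "Amsterdam", PostalCode = "1234 AB" },
                    new AddressDocument { City = "Utrecht", PostalCode = "5678 CD" }
                }
            },
            new PersonDocument
            {
                Name = "Alice",
                Addresses = new List<AddressDocument>
                {
                    // The same address again: redundant, but no join needed.
                    new AddressDocument { City = "Utrecht", PostalCode = "5678 CD" }
                }
            }
        };
    }

    // Finds persons by a field of an embedded document; no join is involved.
    public static List<string> NamesInCity(IEnumerable<PersonDocument> persons, string city)
    {
        return persons
            .Where(p => p.Addresses.Any(a => a.City == city))
            .Select(p => p.Name)
            .ToList();
    }

    public static void Main()
    {
        List<PersonDocument> persons = BuildDocuments();
        Console.WriteLine(string.Join(", ", NamesInCity(persons, "Utrecht")));
    }
}
```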

The Graph Model

Next there’s the Graph model. This model, like its name implies, stores data in graphs, with nodes, edges and properties to represent the data. A graph is a mathematical structure and I won’t go into it any further. Graph databases model data as networks of relationships between entities. Sounds difficult? I think so too. Anyway, when your application is based on various relationships, such as social networks, the graph database is the way to go.
Some examples of Graph databases are HyperGraphDB and Neo4j.

The Key-value Model

Key-value databases are the simplest of the NoSQL databases. They basically provide a key and a value, where the value can be anything. Data can be queried by key only. Each key can have a different (type of) value. Because of this simplicity these databases tend to be highly performant and scalable, however, because of this simplicity, they’re also not applicable to many applications.
Some examples of Key-value databases are Redis and Riak.
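
If you want a mental picture of the Key-value model, a dictionary comes pretty close. The following is only a sketch, assuming an in-memory Dictionary as a stand-in for a real store; the keys and values are made up and this is not how Redis or Riak are actually accessed.

```csharp
using System;
using System.Collections.Generic;

public static class KeyValueDemo
{
    // An in-memory stand-in for a key-value store; each key has its own type of value.
    public static Dictionary<string, object> BuildStore()
    {
        return new Dictionary<string, object>
        {
            { "user:1:name", "Sander" },
            { "user:1:visits", 42 },
            { "user:1:tags", new[] { "c#", "nosql" } }
        };
    }

    public static void Main()
    {
        Dictionary<string, object> store = BuildStore();

        // Reading by key is the only lookup the model offers.
        object visits;
        if (store.TryGetValue("user:1:visits", out visits))
        {
            Console.WriteLine("Visits: {0}", (int)visits);
        }

        // A key that was never stored simply isn't there.
        Console.WriteLine("Has user:2:name: {0}", store.ContainsKey("user:2:name"));
    }
}
```

Finding all users with more than 40 visits would mean walking over every key, which is exactly the kind of query this model isn’t meant for.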

The Wide Column Model

Last is the Wide Column model. Like the Key-value model, the Wide Column model consists of a key on which data can be queried, can be highly performant and isn’t suited to every application. Each key holds a ‘single’ value that can have a variable number of columns. Each column can nest other columns. Columns can be grouped into a family and each column can be part of multiple column families. Like the Document model, the schema of a Wide Column store is flexible. Phew, and I thought the Graph model was complicated!
Some examples of Wide Column databases are Cassandra and HBase.

Getting started with MongoDB

So anyway, there you have it. I must admit I haven’t actually used any of them, but I’m certainly planning to get into them a bit deeper. And actually, as promised, I’m going to try one out right now! I’ve picked MongoDB, one of the fastest growing databases of the moment. It’s a Document store and so has a wider applicability than the other types. You can download the free version at www.mongodb.org. There’s also a lot of documentation on there, so I recommend you look around a bit later. Installation is pretty straightforward. Just click next a few times and install. If you change any settings I won’t be held responsible if it doesn’t work or if you can’t follow the rest of this post. So go ahead, I’ll wait.
Ready? Once you have installed MongoDB you’ll need to run it. I was a bit surprised it doesn’t run as a service (like, for example, SQL Server) by default.
So how do you start MongoDB? Open up a command window (yes, really). First you need to create the data directory where MongoDB stores its files. The default is data\db, to create it type md data\db in your command window. Next you need to navigate to the folder where you’ve installed MongoDB. For me this was C:\Program Files\MongoDB 2.6 Standard\bin. Then start mongod.exe. If, like me, you’ve never had to work with a command window here’s what you need to type in your command window:

cd C:\
md data\db
cd C:\Program Files\MongoDB 2.6 Standard\bin
mongod.exe

If you still encounter problems or you’re not running Windows you can check this Install MongoDB tutorial. It also explains how to run MongoDB as a service, so recommended reading material there!

You might be wondering if MongoDB has a Management System where we can query and edit data without the need of a programming language. You can use the command window to issue JavaScript commands to your MongoDB database. To do this you’ll need to start mongo.exe through a command window. The Getting Started with MongoDB page explains this in greater detail. However I would HIGHLY RECOMMEND that you download MongoVUE instead. It’s an easy to use, graphical, management system for MongoDB. Do yourself a favour and install it before you read any further. You can check out the data we’ll be inserting and editing in the next paragraphs.

One more thing before we continue. Mongo stores its documents as BSON, which stands for Binary JSON. It’s not really relevant right now, but it’s good to know. We’ll see some classes named Bson*, now you know where that comes from. MongoVUE lets you see your stored documents in JSON format.

The C# side of MongoDB

So now that we are running MongoDB start up a new C# Console project in Visual Studio. Make sure you have saved your project (just call it MongoDBTest or something). Now open up the Package Manager Console, which can be found in the menu under Tools -> Library Package Manager -> Package Manager Console. Getting MongoDB to work in your project is as simple as entering the following command: PM> Install-Package mongocsharpdriver. The MongoDB drivers will be installed and added to your project automatically. Make sure you import the following namespaces to your file:

using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;
using MongoDB.Driver.Builders;
using MongoDB.Driver.Linq;
using System;
using System.Linq;

So are you ready to write some code? First we’ll need something we want to store in our database, let’s say a Person. I’ve created the following class to work with when we start.

public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
}

Classes don’t come easier. Notice I’ve used the ObjectId for the Id field. Using this type for an ID field makes Mongo generate an ID for you. You can use any type as an ID field, but you’ll need to set it to a unique value yourself (or you’ll overwrite the record that already has that ID).  Another gotcha is that you need to call your ID field Id (case-sensitive) or annotate it with the BsonIdAttribute. And since we’re talking about Attributes, here’s another one that’ll come in handy soon, the BsonIgnoreAttribute. Properties with that Attribute won’t be persisted to the store.

public class Person
{
    [BsonId()]
    public ObjectId MyID { get; set; }
    public string Name { get; set; }
    [BsonIgnore()]
    public string NotPersisted { get; set; }
}

For now we’ll work with the default Id field. So now let’s make a connection to our instance and create a database. This is actually rather easy as you’ll see. Mongo creates a database automatically whenever you put some data in it. After we have a connection to our database we’ll want to put some data in that database. More specifically, we want to create a Person and store it. To do this we’ll first ask for a collection of Persons with a specific name (a table name, if you like). You can store multiple collections of Persons if you use different names for the collections, so beware of typos! After we have the collection from the database we’ll create a Person and save it to the database. That’s a lot of stuff all at once, but actually the code is so simple you’ll get it anyway!

// Connect to the database.
string connectionString = "mongodb://localhost";
MongoClient client = new MongoClient(connectionString);
MongoServer server = client.GetServer();
MongoDatabase database = server.GetDatabase("testdb");

// Store a person.
MongoCollection<Person> persons = database.GetCollection<Person>("person");
Person p1 = new Person() { Name = "Sander" };
persons.Save(p1);
Console.WriteLine(p1.Id.ToString());
Console.ReadKey();

Wow, that was pretty easy, wasn’t it!? Mongo generated an ID for you, as you can see. Next we’re going to get this Person back from our database. There are a few ways to do this. We can work using the Mongo API or we can use LINQ. Both offer multiple methods of querying for one or multiple records. I suggest you read the documentation and experiment a bit. For now I’ll show you a couple of methods to get our Person back from the database.

// Using the MongoDB API.
ObjectId id = p1.Id;
Person sanderById = persons.FindOneById(id);
Person sanderByName = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));

// Using LINQ.
var sandersByLinq = from p in persons.AsQueryable()
                    where p.Name == "Sander"
                    select p;
Person sander = sandersByLinq.SingleOrDefault();

You’ll notice the Query.EQ. EQ stands for equal and builds a query that tests if a field is equal to a specific value. There are other query types like GT (Greater Than), LT (Less Than), In, Exists etc.

But wait, I’m not happy with this code at all! What Person really needs are LastName and Age fields. Now here comes this flexible schema I’ve been telling you about. Simply add the properties to your class. If you fetch a Person that doesn’t have these fields specified they’ll be set to a default value. In case of Age you might want to use an int? rather than an int, or your already existing Persons will have an age of 0 rather than null.

Person incompleteSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
Console.WriteLine(String.Format("{0}'s last name is {1} and {0}'s age is {2}",
    incompleteSander.Name, incompleteSander.LastName, incompleteSander.Age.ToString()));

incompleteSander.LastName = "Rossel";
incompleteSander.Age = 27;

// Let's save those new values.
persons.Save(incompleteSander);

Console.ReadKey();
// Retrieve the person again, but this time with last name and age.
Person completeSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
Console.WriteLine(String.Format("{0}'s last name is {1} and {0}'s age is {2}",
    completeSander.Name, completeSander.LastName, completeSander.Age.ToString()));

Console.ReadKey();
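
The int versus int? remark comes down to plain C# default values. Here’s a minimal sketch that doesn’t touch MongoDB: the two classes are hypothetical stand-ins for two versions of Person, and creating an object without setting Age simulates loading a stored person that has no Age field.

```csharp
using System;

// Hypothetical stand-ins for two versions of the Person class.
public class PersonWithInt
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonWithNullableInt
{
    public string Name { get; set; }
    public int? Age { get; set; }
}

public static class DefaultValueDemo
{
    public static void Main()
    {
        // Age is never assigned, like a stored document without an Age field.
        PersonWithInt withInt = new PersonWithInt { Name = "Sander" };
        PersonWithNullableInt withNullable = new PersonWithNullableInt { Name = "Sander" };

        // An int cannot express 'unknown', so it silently becomes 0.
        Console.WriteLine("int Age: {0}", withInt.Age);

        // An int? keeps the difference between 'no age stored' and 'age is 0'.
        Console.WriteLine("int? Age has a value: {0}", withNullable.Age.HasValue);
    }
}
```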

Now let’s also add an address to Person. Address will be a new class and Person will hold a reference to an Address. Now you can just model this like you always would.

public class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
    public string LastName { get; set; }
    public int? Age { get; set; }
    public Address Address { get; set; }
}

public class Address
{
    public string AddressLine { get; set; }
    public string PostalCode { get; set; }
}

Notice that Address doesn’t need an Id field? That’s because it’s a sub-document of Person, it doesn’t exist without a Person and as such doesn’t need an Id to make it unique. Now fetch your already existing Person from the database, check that its address is empty, create an address, save it and fetch it again.

Person addresslessSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
if (addresslessSander.Address != null)
{
    Console.WriteLine(String.Format("Sander lives at {0} on postal code {1}", addresslessSander.Address.AddressLine, addresslessSander.Address.PostalCode));
}
else
{
    Console.WriteLine("Sander lives nowhere...");
}

addresslessSander.Address = new Address() { AddressLine = "Somewhere", PostalCode = "1234 AB" };
persons.Save(addresslessSander);

Person addressSander = persons.FindOne(Query<Person>.EQ(p => p.Name, "Sander"));
if (addressSander.Address != null)
{
    Console.WriteLine(String.Format("Sander lives at {0} on postal code {1}", addressSander.Address.AddressLine, addressSander.Address.PostalCode));
}
else
{
    Console.WriteLine("Sander lives nowhere...");
}

Console.ReadKey();

Make sure you check out the JSON in MongoVUE. Also try experimenting with Lists of classes. Try adding more Addresses, for example. We haven’t deleted or updated any records either, we’ve only overwritten entire entries. Experiment and read the documentation.

We’ve now scratched the surface of NoSQL and MongoDB in particular. Of course MongoDB has a lot more to offer, but I hope this post has helped you get your feet wet in NoSQL and MongoDB. Perhaps it has given you that little push you needed to get started. It has for me. Expect more NoSQL blogs in the future!

Additional reading

As promised, here’s some additional reading:
NoSQL – Wikipedia
MongoDB White Papers
Document Databases : A look at them
How to take advantage of Redis just adding it to your stack

Comments are welcome. Happy coding!

Using C# to connect to and query from a SQL database

As a developer you’ll probably spend a lot of time getting data in and out of a database. Data is important in any organization and your job as a developer is to present that data to a user, have them add or edit that data and store it back to the database.

Yet I have found that many developers really have no clue how to work with a database! Many developers can get data out of databases, but do so in an unsafe way that may break your code and, worse, give hackers an opportunity to get direct access to your database! Others use an ORM like NHibernate, Entity Framework or LINQ To SQL, but have no idea what’s going on. In this blog post I will address these issues: how to set up a database connection, query for data in a secure manner and use that data in your code. I’ll also show you how to push data back to a database.

I am assuming you know how to set up a database and you know your way around C# and the .NET Framework. For my example I have used the Adventure Works 2014 Sample Database on a SQL Server 2014 database.

Creating a Connection

So let’s start. To create a connection to a database you’ll first need a database connection object. In our case we need a specific type of connection object, being the SqlConnection. Using the SqlConnection you can configure all kinds of settings that are used for your current session to the database. In this blog we’ll use defaults only. For creating the SqlConnection we’ll use the constructor that takes a connection string as input parameter. Usually you’d get the connection string from a config file or some such. Alternatively you can create one using the SqlConnectionStringBuilder, but I won’t go into that here. Notice that I’ve wrapped the SqlConnection in a using block. This ensures that the connection is actually closed once we’re done with it. Make sure you actually open the connection only when needed.

using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
{
    connection.Open();
}
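
For the curious, the SqlConnectionStringBuilder route could look roughly like this. It’s a minimal sketch, assuming the same local server and AdventureWorks2014 database used throughout this post; the ConnectionStrings class and BuildLocal method are names I made up for illustration.

```csharp
using System.Data.SqlClient;

public static class ConnectionStrings
{
    // Hypothetical helper: builds the connection string from typed properties
    // instead of one hand-written string.
    public static string BuildLocal(string database)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "(local)",
            InitialCatalog = database,
            IntegratedSecurity = true
        };
        return builder.ConnectionString;
    }
}
```

You would then pass the result to the constructor, as in new SqlConnection(ConnectionStrings.BuildLocal("AdventureWorks2014")). The builder takes care of quoting and escaping values, which is handy when parts of the connection string come from configuration.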

Creating a Command

Unfortunately this doesn’t do anything yet. We’ll need a SqlCommand which takes the query we want to send to the database. In this case I’m going to select all persons from the table Person.Person. We can create a command object in different ways, but I’m going to create one using the constructor that takes the query and our just created connection. Once we have created the command we must open the connection (which we already did) and have it execute our query. There are a few ways to have the command actually execute your query.

Executing a Command

The first is ExecuteNonQuery, which seems odd because we are going to execute a query, right? Well, actually you use this method when you don’t expect a result (perhaps an update statement, or a call to a Stored Procedure that returns no resultset).

The second method, and the one we’ll need in this example, is ExecuteReader. This method returns a SqlDataReader which represents a forward-only stream of rows from the database. The columns of each row can be accessed by index or name. We’ll see how to use the SqlDataReader in the next example.

The third method, and the last I will discuss, is ExecuteScalar. You can use this method when you expect exactly one value from a query; it returns the first column of the first row of the result and ignores the rest.
There’s also an ExecuteXmlReader method which I will not discuss here. Additionally, most of these methods have asynchronous versions. For older versions of .NET these are the Begin… and End… pairs (such as BeginExecuteReader and EndExecuteReader) and for later versions of .NET these are the …Async methods (such as ExecuteReaderAsync). I will not discuss them here.
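
Since ExecuteScalar doesn’t get its own example further on, here is a minimal sketch of how it could be used. It assumes the AdventureWorks Person.Person table; the PersonQueries class and CountPersons method are hypothetical names.

```csharp
using System.Data.SqlClient;

public static class PersonQueries
{
    // Hypothetical helper: counts the rows in Person.Person.
    // ExecuteScalar returns the first column of the first row as an object,
    // so the result has to be cast (COUNT(*) comes back as an int).
    public static int CountPersons(string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("SELECT COUNT(*) FROM Person.Person", connection))
        {
            connection.Open();
            return (int)cmd.ExecuteScalar();
        }
    }
}
```

Keep in mind that ExecuteScalar returns null when the query yields no rows at all, so a direct cast like this is only safe for queries, such as COUNT(*), that always produce a row.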

So let’s look at our example. We’re going to create a command to fetch some data from the Person.Person table and use ExecuteReader to get our results.

List<Person> persons = new List<Person>();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
{
    connection.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        // Check if the reader has any rows at all before starting to read.
        if (reader.HasRows)
        {
            // Read advances to the next row.
            while (reader.Read())
            {
                Person p = new Person();
                // To avoid unexpected bugs access columns by name.
                p.ID = reader.GetInt32(reader.GetOrdinal("ID"));
                p.FirstName = reader.GetString(reader.GetOrdinal("FirstName"));
                int middleNameIndex = reader.GetOrdinal("MiddleName");
                // If a column is nullable always check for DBNull...
                if (!reader.IsDBNull(middleNameIndex))
                {
                    p.MiddleName = reader.GetString(middleNameIndex);
                }
                p.LastName = reader.GetString(reader.GetOrdinal("LastName"));
                persons.Add(p);
            }
        }
    }
}
// Use persons here...

You may have noticed that getting a value from a SqlDataReader isn’t easy! There are methods like GetString, GetInt32, GetBoolean, etc. to convert values from their database representation to their CLR type equivalents. Unfortunately they throw on DBNull values. So in the case of MiddleName, which is a nullable column in the database, we need to check for DBNull before setting the MiddleName value. For nullable columns that map to value types such as integers or booleans, we would store the result in the nullable equivalents of those types, like int? or bool? (which is short for Nullable<int> and Nullable<bool>).
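
One way to keep the DBNull checks from cluttering your reading code is a pair of small extension methods. This is just a sketch; DataRecordExtensions, GetNullableInt32 and GetNullableString are names I invented, not part of the .NET Framework.

```csharp
using System;
using System.Data;

public static class DataRecordExtensions
{
    // Hypothetical helper: returns null instead of throwing when the column holds DBNull.
    public static int? GetNullableInt32(this IDataRecord record, string name)
    {
        int ordinal = record.GetOrdinal(name);
        return record.IsDBNull(ordinal) ? (int?)null : record.GetInt32(ordinal);
    }

    // Same idea for reference types such as string.
    public static string GetNullableString(this IDataRecord record, string name)
    {
        int ordinal = record.GetOrdinal(name);
        return record.IsDBNull(ordinal) ? null : record.GetString(ordinal);
    }
}
```

Because SqlDataReader implements IDataRecord, the MiddleName handling could then shrink to p.MiddleName = reader.GetNullableString("MiddleName");.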

Using an Adapter

Another method to get data from the database is by using a SqlDataAdapter. This results in a DataTable or DataSet (for multiple resultsets) containing the database data. I won’t go into the use of DataTables and DataSets, but they are like in-memory GridViews. They even track if a row was changed and can automatically generate update, insert or delete commands when used with a SqlCommandBuilder.
The next code snippet shows how to fill a DataTable (that’s a lot less code than the SqlDataReader example, but keep in mind that the result is also very different).

DataTable table = new DataTable();
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person", connection))
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(table);
}
// Use table here...
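
To give an idea of the change tracking mentioned above, here is a small sketch that builds a DataTable by hand, so it runs without a database. The DataTableDemo class and its contents are made up for illustration.

```csharp
using System;
using System.Data;

public static class DataTableDemo
{
    public static DataRowState EditFirstRow()
    {
        // Build a small table by hand so the sketch runs without a database.
        DataTable table = new DataTable("Person");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("FirstName", typeof(string));
        table.Rows.Add(1, "Ken");

        // Mark the current contents as the unchanged baseline,
        // which is the state rows are in after a default adapter.Fill.
        table.AcceptChanges();

        // Columns are accessed by name; values come back as object.
        table.Rows[0]["FirstName"] = "Sander";

        // The row now remembers that it was modified.
        return table.Rows[0].RowState;
    }
}
```

It is this RowState that an adapter looks at when deciding which insert, update or delete command to run for each row.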

SQL Injection

For the next example we are going to select a subset of persons by first name. That means we’ll have to change our query. Let’s look at an example.

string firstName = "John";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = '" + firstName + "'", connection))
{
    // ...
}

Looking good, right? NO! THIS IS REALLY VERY WRONG! For John this works great (I’ll tell you in a moment why it works, but still isn’t great), but for D’Artagnan (a musketeer) this won’t work at all! While the apostrophe is all good in C# it ends a string in SQL. So the query you’ll be sending to SQL is SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = 'D'Artagnan'. Go to SQL Server Management Studio, open a new query window and try to run that exact query. You’ll get an error message saying something about an unclosed quotation mark. What it should’ve been was 'D''Artagnan' (with the apostrophe doubled). But even replacing every apostrophe with a double apostrophe doesn’t fix everything.

Whenever you send a query to SQL Server a query plan is made and the fastest way to get your data is calculated. For our query SQL Server might decide it will use an index we placed on FirstName. Once the plan is decided it’s cached and re-used when the exact same query is called. In our example that would mean a plan is made and cached for each name we look for! That’s not very efficient since every plan will probably be the same anyway…

You’ve been HACKED!

What’s even worse, and THIS IS VERY IMPORTANT, is that concatenating strings to form a query like that is a HUGE SECURITY RISK! Maybe you’ve heard of SQL Injection Attacks. Let me demonstrate this. Let’s assume for a moment that the user gets a textbox to enter a name and that name is concatenated to your query like above. Now the user enters John'; USE master; DROP DATABASE AdventureWorks2014 -- and BAM! There goes your database… Really, it’s gone. I hope you have a backup. This technique is also used to steal personal information of users, like email addresses and passwords.
Here is a mandatory xkcd on the subject:

xkcd: Exploits of a Mom
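
To make both problems concrete, the following sketch only builds the query text the unsafe way so you can look at it; it never touches a database. InjectionDemo and BuildUnsafeQuery are hypothetical names, and the column list is shortened for readability.

```csharp
using System;

public static class InjectionDemo
{
    // Builds the query by concatenation, like the unsafe example above,
    // so the resulting SQL text can be inspected. Never execute a query built like this.
    public static string BuildUnsafeQuery(string firstName)
    {
        return "SELECT BusinessEntityID AS ID FROM Person.Person WHERE FirstName = '" + firstName + "'";
    }
}
```

Calling BuildUnsafeQuery("D'Artagnan") yields text ending in FirstName = 'D'Artagnan', where the apostrophe closes the SQL string too early. Calling it with John'; USE master; DROP DATABASE AdventureWorks2014 -- yields text in which the user’s input has become extra SQL statements, with the trailing -- commenting out the closing quote.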

Parameterization

So how are we going to solve these problems? Parameterization! By creating parameterized queries the query plan can be re-used for different values and SQL injection belongs to the past! So how does this look?

string firstName = "John";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("SELECT BusinessEntityID AS ID, FirstName, MiddleName, LastName FROM Person.Person WHERE FirstName = @FirstName", connection))
{
    cmd.Parameters.AddWithValue("FirstName", firstName);
    connection.Open();
    using (var reader = cmd.ExecuteReader())
    {
        // ...
    }
}

And that’s how easy it is! Notice that by adding a parameter we also improved the readability of our code. Wow, that’s a win-win-win situation!
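
A small side note: AddWithValue infers the SQL type of the parameter from the .NET value you pass in, and for strings that includes the length. If you’d rather state the type yourself, something like the sketch below should do. TypedParameters and AddFirstName are hypothetical names, and NVarChar(50) is my assumption about the FirstName column, so check your own schema.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class TypedParameters
{
    // Hypothetical helper: adds @FirstName with an explicit type and size
    // instead of letting AddWithValue infer them from the .NET value.
    // NVarChar(50) is an assumption about the column definition.
    public static SqlParameter AddFirstName(SqlCommand cmd, string firstName)
    {
        SqlParameter parameter = cmd.Parameters.Add("@FirstName", SqlDbType.NVarChar, 50);
        parameter.Value = firstName;
        return parameter;
    }
}
```

With a fixed type and size the parameter looks the same to SQL Server no matter which name you search for, which fits nicely with the plan re-use we were after.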

There is one caveat though: when you want to pass a NULL to the database you’ll have to use the DBNull.Value object instead of simply null. So when fetching data we converted DBNull to null and now we’ll have to convert null to DBNull. We’ll see this happening in the next example.

Now what if we want to update, insert or delete a record in the database? We can go about it in much the same way, but use ExecuteNonQuery (which returns the number of affected rows only).

int businessEntityID = 1;
string firstName = "Sander";
string middleName = null;
string lastName = "Rossel";
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("UPDATE Person.Person SET FirstName = @FirstName, MiddleName = @MiddleName, LastName = @LastName WHERE BusinessEntityID = @BusinessEntityID", connection))
{
    cmd.Parameters.AddWithValue("FirstName", firstName);
    if (middleName == null)
    {
        cmd.Parameters.AddWithValue("MiddleName", DBNull.Value);
    }
    else
    {
        cmd.Parameters.AddWithValue("MiddleName", middleName);
    }
    cmd.Parameters.AddWithValue("LastName", lastName);
    cmd.Parameters.AddWithValue("BusinessEntityID", businessEntityID);
    connection.Open();
    cmd.ExecuteNonQuery();
}

I have to add that it’s generally a good idea to check for null for ALL your parameters. You can make a helper function to prevent your code from cluttering up too much.
And in case you want your original Person back, here are his first-, middle- and last name: Ken J Sánchez.
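
Such a helper function could be as small as the following extension method. It’s a sketch; SqlParameterExtensions and AddNullable are names I made up.

```csharp
using System;
using System.Data.SqlClient;

public static class SqlParameterExtensions
{
    // Hypothetical helper: adds a parameter and translates a C# null into DBNull.Value.
    public static SqlParameter AddNullable(this SqlParameterCollection parameters, string name, object value)
    {
        return parameters.AddWithValue(name, value ?? DBNull.Value);
    }
}
```

The if-else around MiddleName in the update example would then collapse into a single line: cmd.Parameters.AddNullable("MiddleName", middleName);.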

Stored Procedures

So far we have only worked with plain text queries. Many times you’ll want to execute a stored procedure. This works in much the same way as sending your query to the database. You simply have to set the CommandType of your command to StoredProcedure and pass in the parameters.

int businessEntityID = 1;
string nationalIDNumber = "295847284";
DateTime birthDate = new DateTime(1987, 11, 8);
char maritalStatus = 'S';
char gender = 'M';
using (SqlConnection connection = new SqlConnection("Data Source=(local);Initial Catalog=AdventureWorks2014;Integrated Security=SSPI"))
using (SqlCommand cmd = new SqlCommand("HumanResources.uspUpdateEmployeePersonalInfo", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("BusinessEntityID", businessEntityID);
    cmd.Parameters.AddWithValue("NationalIDNumber", nationalIDNumber);
    cmd.Parameters.AddWithValue("BirthDate", birthDate);
    cmd.Parameters.AddWithValue("MaritalStatus", maritalStatus);
    cmd.Parameters.AddWithValue("Gender", gender);
    connection.Open();
    cmd.ExecuteNonQuery();
}

In case you want your original employee back, here is his original birthdate: 1969-01-29.

Other Databases

Perhaps you have noticed that the SqlConnection inherits from DbConnection which implements IDbConnection. We have also used other classes like the SqlCommand and SqlDataReader which inherit from DbCommand and DbDataReader in the same manner. The only thing you need to know right now is that many database providers have these classes as a common base class, which means that if you know how to connect to SQL Server you (more or less) know how to connect to most SQL databases like Oracle, MySQL, PostgreSQL, Firebird, etc. In theory (and probably in practice too, although I’ve never tried) you can create a flexible data layer that can switch seamlessly between (SQL) databases because of these common base classes and interfaces.
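
As a rough idea of what such a data layer could build on, here is a sketch that only touches the common base classes. GenericQueries and CreateFindByFirstName are hypothetical names, and I’ve only reasoned this through for SQL Server; other providers may want a different parameter prefix or slightly different SQL.

```csharp
using System.Data;
using System.Data.Common;

public static class GenericQueries
{
    // Hypothetical helper: only uses the common base classes, so any provider's
    // DbConnection (SqlConnection, or another vendor's equivalent) can be passed in.
    // Note: the "@FirstName" marker is an assumption that holds for SQL Server;
    // some providers use a different prefix, so the SQL text may still need adjusting.
    public static DbCommand CreateFindByFirstName(DbConnection connection, string firstName)
    {
        DbCommand cmd = connection.CreateCommand();
        cmd.CommandText = "SELECT FirstName, LastName FROM Person.Person WHERE FirstName = @FirstName";

        DbParameter parameter = cmd.CreateParameter();
        parameter.ParameterName = "@FirstName";
        parameter.DbType = DbType.String;
        parameter.Value = firstName;
        cmd.Parameters.Add(parameter);

        return cmd;
    }
}
```

The caller still opens the connection, executes the command and disposes both, exactly as in the SqlClient examples above. Only the concrete connection type changes.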

Wrap up

Well, there you have it. We have successfully and correctly selected data, updated data and executed a stored procedure using C#. I assume you can now guess how to use ExecuteScalar, which I mentioned, but haven’t discussed further. Things don’t stop here though. There’s much more, like queries that return multiple result sets, stored procedures that return output parameters, BLOBs, bulk operations, transactions… Way too much to discuss here. Luckily there are many books, articles and blogs on the subject.

Happy coding!