Using Entity Framework Code First with an existing database – Part 1

If you do any sort of .NET development using a relational database then chances are pretty good that you’d use Entity Framework as your ORM (object-relational mapping) technology – and in particular Entity Framework Code First since that’s the Entity Framework approach that Microsoft appears to favor going forward.

So what exactly is Entity Framework Code First?

The Entity Framework Code First approach allows us to write Plain Old CLR Objects (POCOs) for our data models and then persist them in a data store using the DbContext class. Model classes are much cleaner this way and they are developer-friendly – gone are the days of complex (and often buggy) XML .edmx files which were needed in the past with Entity Framework in order to describe and map the model layer to the physical data storage layer.

Below is a simple example of such a POCO class:

[Table("Books")] // Table name
public class Book
{
    [Key] // Primary key
    public int Id { get; set; }
    public string Title { get; set; }
    public string ISBN { get; set; }

    // Navigational property for reviews
    public virtual ICollection<Review> Reviews { get; set; }
}

We can see here that certain attributes are used to describe some of the physical properties of the SQL table and its columns. Entity Framework Code First makes heavy use of conventions in order to determine what to expect at the physical data storage layer given a particular POCO class that describes the model.

One of the reasons Entity Framework Code First has this particular name is because it is meant for developers to quickly jump into defining model classes which EF will then use to actually build the physical layer structures (tables) using the conventions and attributes mentioned above. The process through which that happens is known as Entity Framework Code First Migrations. EF Code First Migrations can determine what’s different between the logical model and the physical data store and issue the proper SQL commands to adjust the database as needed.

But this is not the focus of this particular article series. I’d like to look at the opposite scenario where the database layer is actually ‘ahead’ of the Entity Framework logical POCO layer and see what options we have for getting our EF model classes in sync with the database. This process is known as reverse engineering a database with Entity Framework Code First or sometimes referred to as Entity Framework Code “Second” (since the code model class updates come second, after the database changes were made).

When is such a scenario likely to happen? I can think of at least 2 instances where reverse engineering a database would be very useful:

You have an existing database already and you need to start a new software development project using Entity Framework Code First that works with that existing database.
You are working on a new project where the database is being developed at the same time with the .NET application code but where the person maintaining the database (most likely a DBA or somebody who prefers database management tools to .NET classes) manages its schema outside of the .NET project (most likely using a tool such as SQL Server Management Studio).

In both cases we need a reliable way to automatically generate these Entity Framework Code First POCO classes.

So what options do we have? Until Entity Framework 6.1 there was nothing ‘official’ included with the Entity Framework package that could help with this task – users would have to rely on third-party Visual Studio extensions in order to reverse engineer a database schema for EF Code First.

In this article we’ll look at how to use one such extension and then later other articles will follow with additional options including the official method that was introduced in Entity Framework 6.1.

The Visual Studio extension we’ll start with is Entity Framework Power Tools Beta 4.

In Visual Studio 2013 go to Tools -> Extensions and Updates and in the new window that comes up click on Online in the left and then search the Visual Studio Gallery for ‘Entity Framework Power Tools Beta 4’.

Push Download and finish the extension installation process.

Let’s create a quick application in Visual Studio 2013 where we’ll use this new extension to reverse engineer the sample AdventureWorks SQL Server database and quickly test that we can access the database with the newly generated Entity Framework Code First POCO model classes. (The various AdventureWorks databases for SQL Server can be downloaded from this location – http://msftdbprodsamples.codeplex.com/.)

I created a new solution called AdventureWorksReverseEngineer with a console application project AdventureWorksPowerTools.

We will first add the latest version of the Entity Framework nuget package to our console application. Go to Tools -> NuGet Package Manager -> Package Manager Console to open the nuget console. Once there, install the latest version of the EF package (currently at 6.1.1) with this command:

PM> Install-Package EntityFramework

Now we will use the extension we installed earlier to reverse engineer the database and get our model classes. Right-click on the AdventureWorksPowerTools project and go to Entity Framework -> Reverse Engineer Code First.

In the next dialog box set the correct connection properties for the AdventureWorks database and let the extension do its work.

… and it almost works except that we get a strange error in the Output window:

error 6004: The table ‘AdventureWorks.Production.Document’ is referenced by a relationship, but cannot be found.

It turns out that it’s a pretty common error with Entity Framework and the AdventureWorks database – for example http://entityframework.codeplex.com/workitem/695. It’s due to the fact that the Production.Document table uses a data type of hierarchyid which isn’t yet officially supported in EF 6.1.1.

Since Entity Framework is now an open-source project there is a user-maintained nuget package that appears to add EF support for hierarchyid – https://www.nuget.org/packages/EntityFrameworkWithHierarchyId/ … but we will not go down this route.

Since we’re just trying to test the reverse engineer process we’ll simply remove from the list of available tables those using this particular data type. But how will we actually do this since the reverse engineer extension gave us no option to select which tables to reverse engineer (it attempted to work on all tables in the selected database)?

This is actually one of the main shortcomings of this particular extension – the fact that it offers no user-selection of the tables on which to perform the reverse engineer process. A solution to this problem would be to attempt the reverse engineer process using credentials for a SQL Server user who’s only allowed to view/read the actual list of SQL tables we want to work on … but that’s far from ideal.

For now we’ll just quickly delete the problem tables from the database itself – Production.ProductDocument and Production.Document – and run the reverse engineer process again.

This time it works. We get a new Models folder in the project containing all our POCO model classes with a Mapping subfolder for classes that further describe physical SQL table properties.

As an example this is the generated code for the Customer class:

using System;
using System.Collections.Generic;

namespace AdventureWorksPowerTools.Models
{
    public partial class Customer
    {
        public Customer()
        {
            this.SalesOrderHeaders = new List<SalesOrderHeader>();
        }

        public int CustomerID { get; set; }
        public Nullable<int> PersonID { get; set; }
        public Nullable<int> StoreID { get; set; }
        public Nullable<int> TerritoryID { get; set; }
        public string AccountNumber { get; set; }
        public System.Guid rowguid { get; set; }
        public System.DateTime ModifiedDate { get; set; }
        public virtual Person Person { get; set; }
        public virtual SalesTerritory SalesTerritory { get; set; }
        public virtual Store Store { get; set; }
        public virtual ICollection<SalesOrderHeader> SalesOrderHeaders { get; set; }
    }
}

It’s also worth noting that the EF Power Tools extension added a connection string entry in app.config for the database we just worked on:

<connectionStrings>
  <add name="AdventureWorksContext" connectionString="Data Source=localhost;Initial Catalog=AdventureWorks;Integrated Security=True;MultipleActiveResultSets=True"
    providerName="System.Data.SqlClient" />
</connectionStrings>

We can now write some quick code in our console app to prove that we can access our data with the newly generated POCO classes:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using AdventureWorksPowerTools.Models;

namespace AdventureWorksPowerTools
{
    class Program
    {
        static void Main(string[] args)
        {
            using (var ctx = new AdventureWorksContext())
            {
                var customers = ctx.Customers.Where(c => c.PersonID != null).Take(10);

                foreach (var customer in customers)
                {
                    Console.WriteLine(string.Format("Customer ID: {0} - First Name: {1} - Last Name: {2}",
                        customer.CustomerID,
                        customer.Person.FirstName,
                        customer.Person.LastName));
                }
            }

            Console.ReadLine();
        }
    }
}

And here’s the output from running the console app:

We were able to easily generate POCO classes for our entity model and use them to access data in our database. If changes are made to the database schema and we want to incorporate them into our code we can repeat the EF reverse engineer process from the Visual Studio extension we just installed above. It will overwrite all the previously generated Models and Mapping class files.

Some closing thoughts regarding this particular Visual Studio extension for reverse engineering a database schema:

The “Entity Framework Power Tools Beta 4” Visual Studio extension is maintained by the Microsoft Entity Framework team but its last update was from 10/12/2013. It’s likely that it will not see any future updates as most of the relevant features were migrated to an official Entity Framework tools package starting with EF version 6.1
Any time the reverse engineer process is repeated it ends up overwriting the previously generated entity model classes so those classes can’t be directly customized in the generated C# files because those changes would be lost next time the process is repeated. The correct approach instead is to pick the option to Customize Reverse Engineer Templates which will add some additional T4 template files to the project which can be modified to affect the generated classes the next time they’re reversed engineered.
As mentioned already above this extension does not allow the user to select which SQL objects inside a database to be included in the process which can be a problem.
The reverse engineer process ends up creating 2 class files for each entity. On my system there were about 90 objects in the AdventureWorks database and it took about 1 minute for the process to complete. For databases with many more tables this process can take quite a while – just something to keep in mind.

So there you have it – a quick way to reverse engineer a database schema in order to build an entity model that can be used with Entity Framework Code First.

The complete Visual Studio solution can be found here:

https://github.com/csatnic/AdventureWorksReverseEngineer

In future articles I will look at other options for accomplishing the same task while trying to overcome some of the difficulties mentioned above.