3 ways to avoid an anemic domain model in EF Core

Anemic (anaemic) domain models are extremely common when using ORMs such as Entity Framework. This post looks at the problems of having an anemic domain model and then goes on to look at a few simple techniques to allow you to create richer models when using Entity Framework Code First and EF Core.

What is an anemic domain model?

An anemic domain model is where you go to the trouble of modeling your domain as a set of classes but those classes contain no business logic. Instead the classes are simply data holders and when using Entity Framework they often consist of little more than a bunch of public getters and setters:

public class BlogPost
{
    public int Id { get; set; }

    [Required]
    [StringLength(250)]
    public string Title { get; set; }

    [Required]
    [StringLength(500)]
    public string Summary { get; set; }

    [Required]
    public string Body { get; set; }

    public DateTime DateAdded { get; set; }

    public DateTime? DatePublished { get; set; }

    public BlogPostStatus Status { get; set; }
    ...
}

Anemic domain models are often described as an anti-pattern due to a complete lack of OO principles. They are dumb objects relying on calling code for validation and other business logic. This lack of abstraction can often lead to code repetition, poor data integrity and increased complexity in higher layers. Anemic domain models are extremely common. From my experience (consulting for dozens of companies), I would estimate that more than 80% of EF domain models are anemic. This is hardly surprising. Practically all documentation and other blog articles demonstrate EF at its simplest. They are focused on getting you started as quickly as possible rather than advocating best practices (which may require slightly more work to set up initially).
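To make the problem concrete, here is a minimal sketch of how the rules end up scattered across callers when the entity is anemic. The two service classes and the members of BlogPostStatus are assumptions made up for illustration; they are not taken from any real code base.

```csharp
using System;

// Assumed enum members; the order (Draft first) is also an assumption.
public enum BlogPostStatus { Draft, Published, Archived }

// Anemic entity: data only, no behaviour.
public class BlogPost
{
    public string Title { get; set; }
    public DateTime? DatePublished { get; set; }
    public BlogPostStatus Status { get; set; }
}

// Hypothetical caller 1: remembers the rules.
public class AdminPostService
{
    public void Publish(BlogPost post)
    {
        if (string.IsNullOrWhiteSpace(post.Title))
        {
            throw new InvalidOperationException("Title is required");
        }

        post.DatePublished = DateTime.UtcNow;
        post.Status = BlogPostStatus.Published;
    }
}

// Hypothetical caller 2: repeats part of the logic and forgets the rest.
public class ImportService
{
    public void Publish(BlogPost post)
    {
        // No title check and no publication date: nothing stops this.
        post.Status = BlogPostStatus.Published;
    }
}
```

Nothing in BlogPost itself prevents the second service from producing a published post with no title and no publication date.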

Moving to a richer domain model

We are going to discuss three easy techniques to enrich your anemic domain models. These are very simple changes that can be adopted with minimal effort:

Remove public parameterless constructors

Unless you specify a constructor, your class will have a default parameterless constructor. This means that you can instantiate your class in the following way:

var blogPost = new BlogPost();

In most situations this makes no sense. Domain objects will typically require at least some data to make them valid. Creating a BlogPost instance without any data such as title or URL is meaningless. Without some useful data for identification, there is no point in allowing such an instance. Some people disagree but the prevailing belief in the DDD community is that it makes sense to ensure that domain objects are always valid. To help with this, we can treat our domain class just like any other OO class and introduce a parameterised constructor:

public BlogPost(string title, string summary, string body)
{
    if (string.IsNullOrWhiteSpace(title))
    {
        throw new ArgumentException("Title is required");
    }

    ...

    Title = title;
    Summary = summary;
    Body = body;
    DateAdded = DateTime.UtcNow;
}

Now calling code must provide a minimum of data to satisfy the contract (the constructor). This change provides two positive outcomes:

  1. Any newly instantiated BlogPost object is now guaranteed to be valid. Any code acting on the BlogPost does not need to check for invalid values. The domain object validates itself automatically on instantiation.

  2. Any calling code knows exactly what is required to instantiate the object. With a parameterless constructor, this is unknown and it is very easy to build an object with missing data.
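As a rough sketch of both outcomes, the class below validates all three arguments in the constructor. The Require helper and the decision to apply the same check to summary and body are assumptions for illustration; the real rules will depend on your domain.

```csharp
using System;

public class BlogPost
{
    // Setters are still public at this stage of the refactoring.
    public string Title { get; set; }
    public string Summary { get; set; }
    public string Body { get; set; }
    public DateTime DateAdded { get; set; }

    public BlogPost(string title, string summary, string body)
    {
        // Assumed rule: every argument must contain some text.
        Title = Require(title, nameof(title));
        Summary = Require(summary, nameof(summary));
        Body = Require(body, nameof(body));
        DateAdded = DateTime.UtcNow;
    }

    // Hypothetical helper that rejects null, empty and whitespace-only values.
    private static string Require(string value, string paramName)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new ArgumentException("A value is required", paramName);
        }

        return value;
    }
}
```

Calling new BlogPost(" ", "summary", "body") throws an ArgumentException before an invalid instance can exist, and the constructor signature documents exactly what a caller has to supply.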

Unfortunately, after making this change, you will find that your EF code no longer works when retrieving entities from the database:

InvalidOperationException: A parameterless constructor was not found on entity type 'BlogPost'. In order to create an instance of 'BlogPost' EF requires that a parameterless constructor be declared.

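EF requires a parameterless constructor for querying, so what to do? Fortunately, while EF does require the parameterless constructor, it need not be public, so we can add a private parameterless constructor for EF while forcing calling code to use the parameterised one. Having the additional constructor is obviously not ideal but these kinds of compromises are often required to get ORMs to play nicely with OO code. Note that EF Core 2.1 and later can also call a constructor whose parameters match mapped properties, so on those versions the exception above may not appear; the private constructor is still needed on older versions and whenever the parameters do not map to properties.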

private BlogPost()
{
    // just for EF
}

public BlogPost(string title, string summary, string body)
{
    ...
}
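The reason a private constructor is enough is that it can still be reached through reflection. The sketch below illustrates the idea with Activator.CreateInstance; it is not a description of how EF materialises entities internally, and the MaterialisationDemo class is a hypothetical name used only for this example.

```csharp
using System;

public class BlogPost
{
    public string Title { get; set; }

    private BlogPost()
    {
        // just for EF
    }

    public BlogPost(string title)
    {
        if (string.IsNullOrWhiteSpace(title))
        {
            throw new ArgumentException("Title is required");
        }

        Title = title;
    }
}

// Hypothetical helper standing in for "something that uses reflection".
public static class MaterialisationDemo
{
    public static BlogPost CreateEmpty()
    {
        // nonPublic: true allows a non-public parameterless constructor to be used.
        return (BlogPost)Activator.CreateInstance(typeof(BlogPost), nonPublic: true);
    }
}
```

Ordinary calling code cannot write new BlogPost() because the compiler rejects it, yet reflection-based code can still create an empty instance and populate it afterwards.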

Remove public property setters

The parameterised constructor introduced above ensures that when instantiated, the object is in a valid state. This does nothing to prevent you from changing property values to invalid values later though. To fix this issue, we have two options:

  1. Add validation logic to the property setters
  2. Prevent direct modification of properties and instead use methods corresponding to user actions

Adding validation to the property setter is perfectly acceptable but does mean that we can no longer use Automatic Properties and must introduce a backing field. Obviously this is not a big deal:

private string title;

public string Title
{
    get { return title; }
    set
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new ArgumentException("Title must contain a value");
        }

        title = value;
    }
}

The main reason why option two is preferred is that it more closely models what happens in the real world. Rather than updating a single property in isolation, users tend to perform a set of known actions (determined by the UI or API). These actions can result in one or more properties being updated but there is often more to it than that. It is very common to have scenarios where business logic is dependent upon context, which can make property setter validation logic complex and hard to understand. As a basic example, consider the following blog post publication process:

public void Publish()
{
    if (Status == BlogPostStatus.Draft || Status == BlogPostStatus.Archived)
    {
        if (Status == BlogPostStatus.Draft)
        {
            DatePublished = DateTime.UtcNow;
        }

        Status = BlogPostStatus.Published;
    }
}

In this example, we have a publish method with some simple logic and two properties that can be updated. We could also implement this as a property setter but it is far less clear, particularly when calling it from another class:

blogPost.Status = BlogPostStatus.Published;

vs

blogPost.Publish();

The side effects of the first option are not at all obvious and this lack of clarity should always be avoided.
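For comparison, this is roughly what the setter-based version might look like. It is only a sketch: the enum members, their order and the Draft default are assumptions.

```csharp
using System;

// Assumed enum members and order.
public enum BlogPostStatus { Draft, Published, Archived }

public class BlogPost
{
    private BlogPostStatus status = BlogPostStatus.Draft;

    public DateTime? DatePublished { get; private set; }

    public BlogPostStatus Status
    {
        get { return status; }
        set
        {
            // Hidden side effect: the first publication also stamps the date.
            if (value == BlogPostStatus.Published && status == BlogPostStatus.Draft)
            {
                DatePublished = DateTime.UtcNow;
            }

            status = value;
        }
    }
}
```

The behaviour for publishing matches the Publish method, but a caller assigning BlogPostStatus.Published to Status has no hint that DatePublished may change as well.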

Of course, what you see in most code bases is no validation in the domain object at all. Instead this type of logic is found in the next layer up, which, as noted earlier, can result in code repetition, poor data integrity and increased complexity.

As we might expect by now, EF will not function correctly if we completely remove the setter from every property but changing the access level to private solves the issue well enough:

public class BlogPost
{
    public int Id { get; private set; }
    ...
}

This way, all properties are read-only outside the class. To allow updates to our domain classes, we introduce action-style methods such as the Publish method shown above.

By removing the parameterless constructor and public property setters and adding action-type methods, we now have domain objects which are always valid and contain all the business logic directly related to the entities in question. This is a great improvement. We have made our code more robust and simpler at the same time. All code that we have added to the domain objects can be removed from higher up the call stack where it is often duplicated in several different places.

While we could discuss other DDD concepts such as domain events and the use of domain services via the double dispatch pattern, their advantages, particularly with respect to simplicity, are far less clear cut. One DDD concept which does generally simplify your code is the use of value objects, which we will discuss next.

Introduce value objects

Value objects are immutable (no changes allowed after instantiation) objects which do not have an identity of their own. Value objects can often be used to take the place of one or more properties in a domain object.

Classic examples of value objects include money, addresses and coordinates but it can also be beneficial to replace a single property with a value object instead of using a string or int. For example, instead of storing a phone number as a string, you could create a PhoneNumber value object with built-in validation as well as methods to extract the dialing code etc.
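A PhoneNumber value object along those lines might be sketched as follows. The validation rules here (an optional leading plus sign followed by 7 to 15 digits, with spaces and hyphens ignored) are assumptions chosen to keep the example short, and dialing code extraction is left out because it needs country-specific data.

```csharp
using System;
using System.Linq;

public sealed class PhoneNumber : IEquatable<PhoneNumber>
{
    public string Value { get; private set; }

    public PhoneNumber(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new ArgumentException("Phone number is required");
        }

        // Drop common separators so equivalent inputs compare equal.
        var normalised = new string(value.Where(c => c != ' ' && c != '-').ToArray());

        var digits = normalised.StartsWith("+", StringComparison.Ordinal)
            ? normalised.Substring(1)
            : normalised;

        // Assumed rule: 7 to 15 ASCII digits after the optional plus sign.
        if (digits.Length < 7 || digits.Length > 15 || !digits.All(c => c >= '0' && c <= '9'))
        {
            throw new ArgumentException("Phone number is not in a recognised format");
        }

        Value = normalised;
    }

    // Value objects have no identity, so equality is based on the value alone.
    public bool Equals(PhoneNumber other)
    {
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as PhoneNumber);
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }

    public override string ToString()
    {
        return Value;
    }
}
```

With this in place, any entity holding a PhoneNumber can rely on it being well formed without repeating the checks.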

The code below shows a money value object implemented as a class for use with EF:

public class Money
{
    [StringLength(3)]
    public string Currency { get; private set; }

    public int Amount { get; private set; }

    private Money()
    {
        // just for EF
    }

    public Money(string currency, int amount)
    {
        // todo validation
        Currency = currency;
        Amount = amount;
    }
}

Currency and Amount are intrinsically linked. Both pieces of information are required for the data to be useful. Therefore it makes sense to model them as such. Note the use of a parameterised constructor and private property setters in exactly the same way as we used when modeling domain objects. A private parameterless constructor is also required here for Entity Framework.
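One way the validation placeholder might be filled in is sketched below, together with value-based equality and an Add method that returns a new instance rather than mutating the existing one. The three-letter currency rule, the upper-casing and the Add method are all assumptions for illustration, and the [StringLength] attribute is omitted to keep the example free of extra namespaces.

```csharp
using System;

public class Money : IEquatable<Money>
{
    public string Currency { get; private set; }

    public int Amount { get; private set; }

    private Money()
    {
        // just for EF
    }

    public Money(string currency, int amount)
    {
        // Assumed rule: currency is a three-character code.
        if (currency == null || currency.Length != 3)
        {
            throw new ArgumentException("Currency must be a three-letter code");
        }

        Currency = currency.ToUpperInvariant();
        Amount = amount;
    }

    // Immutability: "changing" a value object produces a new one.
    public Money Add(Money other)
    {
        if (other == null)
        {
            throw new ArgumentNullException(nameof(other));
        }

        if (other.Currency != Currency)
        {
            throw new InvalidOperationException("Cannot add amounts in different currencies");
        }

        return new Money(Currency, Amount + other.Amount);
    }

    // Two Money instances are equal when both parts are equal.
    public bool Equals(Money other)
    {
        return other != null && Currency == other.Currency && Amount == other.Amount;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Money);
    }

    public override int GetHashCode()
    {
        return (Currency, Amount).GetHashCode();
    }
}
```

Because equality depends only on Currency and Amount, two separately created instances holding the same data are interchangeable, which is what having no identity means in practice.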

In the context of (RDBMS) data persistence, a value object does not live in a separate db table. To allow us to use value objects in Entity Framework, a small addition is required. This differs depending on the EF version that you are using.

In EF6, we simply decorate the value object with the [ComplexType] attribute:

[ComplexType]
public class Money
{
    ...
}

In EF Core, starting from version 2, we can use the less obvious OwnsOne method of the fluent API:

public class BlogContext : DbContext
{
    ...
    public DbSet<BlogPost> BlogPosts { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<BlogPost>().OwnsOne(x => x.AdvertisingFee);
    }
}

This assumes that we are using the Money value object on our blog post entity as follows:

public class BlogPost
{
    ...
    public Money AdvertisingFee { get; private set; }
    ...
}

After creating and running a migration, we will find that our database table now contains two additional columns:

AdvertisingFee_Currency
AdvertisingFee_Amount 

The benefits of using value objects are much the same as the move to rich domain models. A rich domain model removes the need for the calling code to validate the domain model and provides a well-defined abstraction to program against. A value object validates itself so the domain model housing the value object property does not itself need to know how to validate the value object and can be simplified. All very clear and simple.

A word of caution

When you move from an anemic domain model to a richer model, you will immediately appreciate the benefits of encapsulating domain level business logic in your domain objects. Note that it is common to try to take things too far, however. Creating a method on your domain object to perform validation and then update multiple properties is undoubtedly a good thing. Sending an email from your domain object or saving to a database is probably not something that you want to do. It is important to be aware that having a rich domain model does not negate the requirement of another layer to orchestrate these higher level concerns. This is the job of the application service or the command handler depending on your architecture.
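As a sketch of that split, the application service below loads the entity, asks it to publish itself and then deals with persistence and notification. IBlogPostRepository, INotifier and BlogPostService are hypothetical names invented for this example, the enum members are assumed, and the BlogPost class is trimmed to the members needed here.

```csharp
using System;

// Assumed enum members and order.
public enum BlogPostStatus { Draft, Published, Archived }

public class BlogPost
{
    public int Id { get; private set; }
    public BlogPostStatus Status { get; private set; }
    public DateTime? DatePublished { get; private set; }

    // Business rule stays inside the domain object.
    public void Publish()
    {
        if (Status == BlogPostStatus.Draft || Status == BlogPostStatus.Archived)
        {
            if (Status == BlogPostStatus.Draft)
            {
                DatePublished = DateTime.UtcNow;
            }

            Status = BlogPostStatus.Published;
        }
    }
}

// Hypothetical abstractions over persistence and messaging.
public interface IBlogPostRepository
{
    BlogPost GetById(int id);
    void Save(BlogPost post);
}

public interface INotifier
{
    void PostPublished(BlogPost post);
}

// Hypothetical application service: orchestration only, no business rules.
public class BlogPostService
{
    private readonly IBlogPostRepository repository;
    private readonly INotifier notifier;

    public BlogPostService(IBlogPostRepository repository, INotifier notifier)
    {
        this.repository = repository;
        this.notifier = notifier;
    }

    public void Publish(int blogPostId)
    {
        var post = repository.GetById(blogPostId);

        post.Publish();               // domain logic
        repository.Save(post);        // persistence
        notifier.PostPublished(post); // e.g. sending an email
    }
}
```

The domain object knows nothing about databases or email, and the service knows nothing about when a post is allowed to be published.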

A note about unit testing

One perceived negative of a rich, self-validating domain model is that it can make testing harder. With public setters, you can simply assign individual values to any domain object property. This allows you to assign the exact value you require directly to get an object into a certain state for a test. If you lock down your properties and constructor then this approach is not possible. This is not a bad thing. Yes it can make things slightly harder to set up but what you are doing is ensuring that your test is valid.

It also makes testing the logic within the domain objects themselves very simple. While the unit tests for your application service / command handlers will almost certainly require some level of mocking, you should find that most of your domain object tests are much simpler to construct and are often free from dependencies that require mocking.
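The sketch below shows what such a test might look like without any test framework or mocking library. The Archive method, the enum members and the BlogPostTests class are assumptions added for illustration; in a real project the assertions would normally come from your test framework of choice.

```csharp
using System;

// Assumed enum members and order.
public enum BlogPostStatus { Draft, Published, Archived }

public class BlogPost
{
    public string Title { get; private set; }
    public BlogPostStatus Status { get; private set; }
    public DateTime? DatePublished { get; private set; }

    public BlogPost(string title)
    {
        if (string.IsNullOrWhiteSpace(title))
        {
            throw new ArgumentException("Title is required");
        }

        Title = title;
    }

    public void Publish()
    {
        if (Status == BlogPostStatus.Draft || Status == BlogPostStatus.Archived)
        {
            if (Status == BlogPostStatus.Draft)
            {
                DatePublished = DateTime.UtcNow;
            }

            Status = BlogPostStatus.Published;
        }
    }

    // Hypothetical action method, not shown in the article.
    public void Archive()
    {
        if (Status == BlogPostStatus.Published)
        {
            Status = BlogPostStatus.Archived;
        }
    }
}

public static class BlogPostTests
{
    public static void Republishing_an_archived_post_keeps_the_original_date()
    {
        // Arrange: reach the Archived state through the public API only.
        var post = new BlogPost("Hello");
        post.Publish();
        var firstPublished = post.DatePublished;
        post.Archive();

        // Act
        post.Publish();

        // Assert
        if (post.Status != BlogPostStatus.Published)
        {
            throw new Exception("Expected the post to be published again");
        }

        if (post.DatePublished != firstPublished)
        {
            throw new Exception("Expected the original publication date to be kept");
        }
    }
}
```

The arrange step takes a few more lines than assigning Status directly, but it guarantees that the test starts from a state the application can really reach.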

Conclusion

This article showed three very simple techniques that you can use to move from an anemic domain model to a rich domain model when using Entity Framework and EF Core. The use of a parameterised constructor allows us to ensure that our domain model is valid when instantiated. The removal of public property setters ensures that our models remain in a valid state for their entire lifetimes. The introduction of methods to act on the domain model and internally perform validation and change state allows us to centralise business logic and simplify the calling code. Finally we looked at the use of value objects and explained how they can take this simplification and logic encapsulation one step further.


Comments

Jon Hilton wrote on 13 Oct 2017

Really good article.

This is something I've battled with for a few years. We've just started using EF Core for a project at work, and MediatR for representing user actions.

When these ideas (broadly categorised under the term DDD) come together, the difference is massive in terms of being able to understand what any given feature does, and also make safe changes to the business logic.

Thanks again!

Dan wrote on 13 Oct 2017

The entities represent the database. The domain solves the business problem. They are two different things. The entities should only be used to save and get data from the database. They shouldn't be used for business validation. That isn't their purpose. Once the domain objects are validated they should be mapped to the entities only for saving to the database. The only validation on those entities would be database specific like max length check or not nullable, which probably should have been validated on the domain object anyway. EF was never meant to be used in place of business logic.

Gilles wrote on 13 Oct 2017

@Dan: Yes if you want to model your application in a procedural way where objects could be replaced by structs and all classes and methods be made static. Because this is what happens with an anemic domain model.

You should look into domain modeling and DDD for a truly OO approach. OO isn't about dumbed down DTO POCOs and static methods, that's just putting procedural code inside classes and pretending the result is object oriented.

Victor wrote on 13 Oct 2017

"With public getters, you can simply assign individual values to any domain object property. "

should be

"With public setters, you can simply assign individual values to any domain object property. "

Good article, enjoyed reading it.

Paul Hiles wrote on 13 Oct 2017

@victor - thanks. updated.

Wes wrote on 13 Oct 2017

@Dan I've done something that (I think) combines your idea with the spirit of this article by implementing the memento pattern in my domain objects where I am able to do true business logic/validation in the domain following OOP practices discussed here and when I'm ready to persist the object's state to a data store that is where memento comes in. The domain object can give you its current state via a .GetState() instance method and that state is (I think) what you are calling the entity. Once you fetch that state back from the data store you can re-instantiate the domain object via a .LoadFromState(state) static method (Customer.LoadFromState(state) for example).

Is that similar to what you are advocating?

Aaron wrote on 15 Oct 2017

I agree with @Dan. And the procedural scenario you pose is absurd.

Jason wrote on 15 Oct 2017

My team and I have been migrating towards DDD from an anemic EF model in a legacy application. The approach we took was to keep the anemic EF model intact and just use it as an implementation detail on how the repositories persist to the database. We then created new rich domain models that the repositories and domain services work off of. I think it works nicely, as it completely isolates the EF entities from the domain models, although it results in more factory code at the repository level to convert between EF models and domain models.

My team and I are launching a new blog this week, and this will be one of the first concepts we cover. http://codingoncaffeine.com

Nicholas Petersen wrote on 15 Oct 2017

While I certainly appreciate many of the issues the author brings up - just this week some of these issues bugged me - I still feel the route he advocates will be much worse. For instance: Validation logic in every setter, really? That means simply retrieving a bunch of entities from the database (etc) will every time require the validation logic to run on every single property of every single entity *every time* it is retrieved or simply newed up? That is a terrible idea, and it will bring an enormous perf hit. Again, I appreciate some of these points, but if this is a high priority, then the route Jason advocated above seems like the better way to go.

Paul Hiles wrote on 15 Oct 2017

@Nicholas - while the article mentions property setter validation, it actually advocates the use of methods to update property values. If you want to talk about performance, having a domain model, a separate EF model and the necessary mapping code is not only more complicated but it will be slower due to the additional mapping. Besides if you are really interested in performance then you would probably want to use CQRS and bypass the domain model for querying.

I understand Jason's approach for a legacy application but fail to see any advantage to this model in other situations. Why go to the trouble of creating and maintaining two sets of classes and associated mapping code? It is certainly not necessary and is far more work so what is the big advantage that you are getting out of it? I genuinely have no idea and would be interested to hear some concrete reasons.

David Keaveny wrote on 15 Oct 2017

I've worked with systems where the EF-generated classes (we were using a DB-first model) became not only our anemic domain objects but also API DTOs, which meant any change to the DB schema would result in a breaking API change (it was an outsourced/offshored project, and achieved new levels of notoriety for poor quality. OO is something that happens to other people, apparently). I definitely think of anemic domain objects as an anti-pattern.

That said, I fall more into @Dan's camp. These days I use Dapper rather than EF. Dapper concentrates on getting stuff out of your database as quickly as possible, so we have POCOs to serve as a model of our actual database schema (they are defined as internal, so people don't get tempted to use them as domain objects), and then get mapped to domain objects as quickly as possible, which themselves will get mapped to API DTOs later on, so there is the extra price to pay for all that mapping; but it keeps our externally-facing API more consistent and less likely to break when we introduce a new field to the database.

@Wes's suggestion of the memento pattern is interesting, but it means we would have to make our DB objects public rather than internal, so not a path I would want to go down.

Jason wrote on 15 Oct 2017

@Paul @Nicholas I’m an advocate of using the EF core entities as your domain models when starting a new application from the ground up; it’ll save a lot of time and code complexity at the repository level in the long run.

We solved the performance issue @Paul mentioned by having an in-memory cached repository proxy in front of the EF repository for the performance critical domain models in the application.

Jarrod wrote on 15 Oct 2017

@Paul, the biggest reason I advocate separating the domain from the EF objects is separation of concerns and bounded contexts. When doing database first modeling with EF, the default behaviour is to add navigation properties on to the objects for foreign key/primary key relationships. This in and of itself can easily violate aggregate root boundaries. You have to ensure that this option is turned off when generating the model from the database.

The next issue is that by modifying the generated entity model files, they could be accidentally overwritten by refreshing your models when making a change in the database.

Having separate domain objects that use the repository patterns for ferrying data to and from the database keeps our models nice and clean, allowing us to enforce aggregate root boundaries easier and puts us in a better position to mitigate breaking changes due to a technology change or new version of EF that isn't friendly to legacy code.

It's extra work and logic, but it protects our domain from outside issues and enforces an anti-corruption layer in case we get some requirement that invalidates our use of EF.

Paul Hiles wrote on 15 Oct 2017

@David - Obviously I agree that an API requires a dedicated interface and using the domain model here is a terrible idea.

I also agree that when using a micro-ORM such as Dapper then more mapping may be required. However, EF is a fully-featured ORM that is perfectly capable of mapping between your database and your domain model directly. Yes, there are a few compromises necessary but creating a whole other layer and mappings seems a little over the top.

@Jarrod - thanks for your response. I understand what you are saying about database first and generated files. If you use this approach, I can see it being a real problem. I suppose I assumed that most people were now using the code first domain-centric approach.

Jarrod wrote on 15 Oct 2017

Yeah, my preference would be code first but with a previously established code base, it wasn't an option. This is the best way we could introduce DDD concepts into a previously static DAL accessors mentality. Your article is a fantastic read, thanks for taking the time to put it together.