What Exactly is a “Repository”?

The Repository pattern is one of the concepts frequently used and mentioned within Android Architecture Components. As English is not my first language, I draw a blank whenever I read or hear the term “repository” itself. I have an abstract idea of what it is, mostly from GitHub repositories: it seems to be a central place for something.

In my experience, not understanding an English term is a common barrier to building a good intuition for a programming concept. I would read about something new and have a hard time connecting a term that drew a blank in my mind to the larger concept. So today I figured I’d geek out a little and see what the word “repository” really means.

According to Merriam-Webster, the word means:

: a place, room, or container where something is deposited or stored

This definition matches my understanding of a Git repository: it’s a place where the code is stored.

What about when the term is used in Android Architecture Components? From this Codelab:

A repository class abstracts access to multiple data sources. The repository is not part of the Architecture Components libraries, but is a suggested best practice for code separation and architecture. A Repository class provides a clean API for data access to the rest of the application.


A Repository manages queries and allows you to use multiple backends. In the most common example, the Repository implements the logic for deciding whether to fetch data from a network or use results cached in a local database.

Uh, what? That sounds like it’s doing a lot more than just being storage.
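To make the Codelab’s “most common example” a bit more concrete, here is a minimal sketch in plain Java of a repository that decides between a local cache and the network. None of these names come from the Codelab or from any Android library: `WordRepository`, `LocalCache`, and `RemoteSource` are all made up for illustration, and the “network” is just a lambda that counts how often it is called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class WordRepositoryDemo {

    /** Hypothetical remote source; a real app would make an HTTP call here. */
    interface RemoteSource {
        String fetchDefinition(String word);
    }

    /** Hypothetical local cache; a real app might back this with a database. */
    static class LocalCache {
        private final Map<String, String> rows = new HashMap<>();

        Optional<String> find(String word) {
            return Optional.ofNullable(rows.get(word));
        }

        void save(String word, String definition) {
            rows.put(word, definition);
        }
    }

    /** The repository: callers ask for data and never see where it came from. */
    static class WordRepository {
        private final LocalCache cache;
        private final RemoteSource remote;

        WordRepository(LocalCache cache, RemoteSource remote) {
            this.cache = cache;
            this.remote = remote;
        }

        String getDefinition(String word) {
            // Prefer the cached copy; otherwise go to the network and remember the result.
            Optional<String> cached = cache.find(word);
            if (cached.isPresent()) {
                return cached.get();
            }
            String fresh = remote.fetchDefinition(word);
            cache.save(word, fresh);
            return fresh;
        }
    }

    public static void main(String[] args) {
        int[] networkCalls = {0};
        // Fake "network" that only counts how many times it was asked.
        RemoteSource fakeRemote = word -> {
            networkCalls[0]++;
            return "definition of " + word;
        };
        WordRepository repository = new WordRepository(new LocalCache(), fakeRemote);

        System.out.println(repository.getDefinition("repository")); // goes to the "network"
        System.out.println(repository.getDefinition("repository")); // served from the cache
        System.out.println("network calls: " + networkCalls[0]);    // prints "network calls: 1"
    }
}
```

The caller only ever sees `getDefinition`. Whether the answer came from the cache or the network is the repository’s business, which is roughly the “logic for deciding” the Codelab mentions.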

The Codelab also provides this graphic:

Repository concept in Android Architecture Components

I found a clearer explanation from an article about the Repository pattern:

[The repository] will persist your objects sans the need of having to know how those objects would be actually persisted in the underlying database, i.e., without having to be bothered about how the data persistence happens underneath. The knowledge of this persistence, i.e., the persistence logic, is encapsulated inside the Repository.

In essence, the Repository design pattern facilitates de-coupling of the business logic and the data access layers in your application with the former not having to have any knowledge on how data persistence would actually take place.

In using the Repository design pattern, you can hide the details of how the data is eventually stored or retrieved to and from the data store. This data store can be a database, an xml file, etc. You can apply this design pattern to even hide how data that is exposed by a web service or an ORM is accessed. Martin Fowler states: “Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.”

So from the above, it sounds like the Repository pattern is more of a way to give access to various data sources without the caller having to know how the data is actually stored or retrieved. It’s not just a dumb place to store things. Instead, it’s smart storage that lets someone add or remove things as needed.
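Here is how I would sketch that “collection-like interface” idea in plain Java. This is only an illustration under my own assumptions: `BookRepository` and both implementations are hypothetical names, and the second implementation merely pretends to be a file or a database table by keeping text lines in a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class BookRepositoryDemo {

    /** Collection-like interface: add, remove, find. No storage details leak out. */
    interface BookRepository {
        void add(String title, String author);
        void remove(String title);
        Optional<String> findAuthor(String title);
    }

    /** Keeps books in a map in memory. */
    static class InMemoryBookRepository implements BookRepository {
        private final Map<String, String> books = new LinkedHashMap<>();

        public void add(String title, String author) { books.put(title, author); }
        public void remove(String title) { books.remove(title); }
        public Optional<String> findAuthor(String title) {
            return Optional.ofNullable(books.get(title));
        }
    }

    /** Keeps books as "title|author" lines, standing in for a file or a table. */
    static class LineBasedBookRepository implements BookRepository {
        private final List<String> lines = new ArrayList<>();

        public void add(String title, String author) {
            remove(title); // keep at most one line per title
            lines.add(title + "|" + author);
        }
        public void remove(String title) {
            lines.removeIf(line -> line.startsWith(title + "|"));
        }
        public Optional<String> findAuthor(String title) {
            return lines.stream()
                    .filter(line -> line.startsWith(title + "|"))
                    .map(line -> line.substring(title.length() + 1))
                    .findFirst();
        }
    }

    /** "Business logic" that depends only on the interface, not on the storage. */
    static String describe(BookRepository repository, String title) {
        return repository.findAuthor(title)
                .map(author -> title + " by " + author)
                .orElse(title + " is not in the repository");
    }

    public static void main(String[] args) {
        BookRepository[] repositories = {
            new InMemoryBookRepository(), new LineBasedBookRepository()
        };
        for (BookRepository repository : repositories) {
            repository.add("Dune", "Frank Herbert");
            System.out.println(describe(repository, "Dune")); // Dune by Frank Herbert
            repository.remove("Dune");
            System.out.println(describe(repository, "Dune")); // Dune is not in the repository
        }
    }
}
```

`describe` behaves the same with either implementation, because all it knows about is the interface. Swapping the storage does not touch the code that uses it.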

That mental model still makes sense for a Git repository as well: I add code to and remove code from my repository without really knowing how GitHub stores it. Is it in a database? If yes, what database? I don’t need to know.

Picture of a library

From now on, I figure I’ll imagine the Repository pattern as a sort of librarian I can ask to help me find a book: I mention a title and they go and fetch it for me. From my perspective it’s not important where the book is stored within the library (shelves? Fridge? Vault?), as long as I can borrow and return it properly. After all, a library is indeed a repository of books, yeah?