The massive volumes of data generated by modern interconnected systems and devices have spawned a new kind of database known as NoSQL. Perhaps the best known of this new breed of non-relational databases is MongoDB. Unlike traditional relational databases (RDBMSes), MongoDB does not contain tables. Instead, it stores data as collections of documents.

In the previous blog, we learned how to create a new database and collection using the Navicat for MongoDB database management & design tool. In today's follow-up, we'll learn about MongoDB documents and add some to our collection.

While MongoDB shares some terms with traditional RDBMSes, others are unique to NoSQL databases. To help clarify, here's a table that compares RDBMS terminology to that of MongoDB:

RDBMS          MongoDB
Database       Database
Table          Collection
Tuple/Row      Document
Column         Field
Table Join     Embedded Documents
Primary Key    Primary Key (default key _id is provided by MongoDB)

MongoDB stores data as BSON documents. BSON is a binary representation of JSON documents, though it supports additional data types beyond those of JSON. MongoDB documents are composed of field:value pairs and have the following structure:

{
    field1: value1,
    field2: value2,
    field3: value3,
    ...
    fieldN: valueN
}

The value of a field can be any valid BSON data type, including other documents, arrays, and arrays of documents. Here's an example of two documents that each contain information about an American city. Notice the different data types:

// 1
{
    "_id": "01005",
    "city": "BARRE",
    "loc": [ -72.108354, 42.409698 ],
    "pop": NumberInt("4546"),
    "state": "MA"
}

// 2
{
    "_id": "01012",
    "city": "CHESTERFIELD",
    "loc": [ -72.833309, 42.38167 ],
    "pop": NumberInt("177"),
    "state": "MA"
}

// 3
//etc...

In the last blog, we created a database named "my_mongo_db" and a collection named "my_first_collection". Now, we'll add some data to the collection in the form of documents.
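For readers who will eventually work with these documents from code, here is a minimal sketch of the two sample city documents written as Python dictionaries. It only illustrates the field:value structure and the mix of data types; it does not talk to a database. The helper name `field_types` is made up for this illustration, and a plain Python `int` stands in for the `NumberInt` wrapper.

```python
# The two sample city documents, written as plain Python dictionaries.
# Assumption: a Python int stands in for the NumberInt("...") wrapper.
cities = [
    {"_id": "01005", "city": "BARRE", "loc": [-72.108354, 42.409698],
     "pop": 4546, "state": "MA"},
    {"_id": "01012", "city": "CHESTERFIELD", "loc": [-72.833309, 42.38167],
     "pop": 177, "state": "MA"},
]


def field_types(document):
    """Return a mapping of each field name to the name of its Python type."""
    return {field: type(value).__name__ for field, value in document.items()}


for city in cities:
    print(city["_id"], field_types(city))
```

Each document mixes strings, an integer, and an array of floating-point numbers, which is the kind of variety the structure above allows.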
You can add more documents by following the same process as above.

Now that we've learned how to add documents to our collection, in the next blog, we'll cover how to view, delete, and edit documents in Navicat for MongoDB.

This tutorial also assumes that a MongoDB instance is running on the default host and port. Assuming you have downloaded and installed MongoDB, you can start it like so:

$ mongod

Making a Connection with MongoClient

The first step when working with PyMongo is to create a MongoClient to the running mongod instance. Doing so is easy:

>>> from pymongo import MongoClient
>>> client = MongoClient()

The above code will connect on the default host and port. We can also specify the host and port explicitly, as follows:

>>> client = MongoClient('localhost', 27017)

Or use the MongoDB URI format:

>>> client = MongoClient('mongodb://localhost:27017/')

Getting a Database

A single instance of MongoDB can support multiple independent databases. When working with PyMongo you access databases using attribute style access on MongoClient instances:

>>> db = client.test_database

If your database name is such that using attribute style access won’t work (like test-database), you can use dictionary style access instead:

>>> db = client['test-database']

Getting a Collection

A collection is a group of documents stored in MongoDB, and can be thought of as roughly the equivalent of a table in a relational database. Getting a collection in PyMongo works the same as getting a database:

>>> collection = db.test_collection

or (using dictionary style access):

>>> collection = db['test-collection']

An important note about collections (and databases) in MongoDB is that they are created lazily - none of the above commands have actually performed any operations on the MongoDB server. Collections and databases are created when the first document is inserted into them.

Documents

Data in MongoDB is represented (and stored) using JSON-style documents.
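Before looking at documents in detail, the connection and lookup steps above can be collected into two small helpers. This is only a sketch: `mongo_uri` and `get_collection` are hypothetical names, not part of PyMongo. `get_collection` relies only on the dictionary style access shown above, so it also works for names such as test-database that attribute style access cannot express.

```python
def mongo_uri(host="localhost", port=27017):
    """Build a MongoDB URI string for the given host and port."""
    return f"mongodb://{host}:{port}/"


def get_collection(client, database_name, collection_name):
    """Look up a collection using dictionary style access only.

    `client` is expected to be a pymongo.MongoClient, or any object that
    supports the same client[name][name] indexing.
    """
    return client[database_name][collection_name]


# Hypothetical usage against a running mongod (requires PyMongo):
#   from pymongo import MongoClient
#   client = MongoClient(mongo_uri())
#   collection = get_collection(client, "test-database", "test-collection")
print(mongo_uri())
```

As with the individual calls, the lookup is lazy: nothing is created on the server until the first document is inserted.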
In PyMongo we use dictionaries to represent documents. As an example, the following dictionary might be used to represent a blog post:

>>> import datetime
>>> post = {"author": "Mike",
...         "text": "My first blog post!",
...         "tags": ["mongodb", "python", "pymongo"],
...         "date": datetime.datetime.now(tz=datetime.timezone.utc)}

Note that documents can contain native Python types (like datetime.datetime instances) which will be automatically converted to and from the appropriate BSON types.

Inserting a Document

To insert a document into a collection (here, a collection named posts) we can use the insert_one() method. When a document is inserted, a special key, "_id", is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult. For more information on "_id", see the documentation on _id.

After inserting the first document, the posts collection has actually been created on the server. We can verify this by listing all of the collections in our database.

Getting a Single Document With find_one()

The most basic type of query that can be performed in MongoDB is find_one(). This method returns a single document matching a query (or None if there are no matches). It is useful when you know there is only one matching document, or are only interested in the first match. Here we use find_one() to get the first document from the posts collection. The result is a dictionary matching the one that we inserted previously.

Note: The returned document contains an "_id", which was automatically added on insert.

find_one() also supports querying on specific elements that the resulting document must match.
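The insert and lookup steps described so far can be sketched as one small function. The names `build_post` and `insert_and_fetch` are illustrations rather than anything PyMongo provides. The sketch assumes `posts` is a PyMongo collection (or any object with the same `insert_one` and `find_one` methods) and filters on the generated `_id`, which is one example of querying on a specific element.

```python
import datetime


def build_post():
    """Return a sample blog post document as a plain dictionary."""
    return {
        "author": "Mike",
        "text": "My first blog post!",
        "tags": ["mongodb", "python", "pymongo"],
        "date": datetime.datetime.now(tz=datetime.timezone.utc),
    }


def insert_and_fetch(posts, post):
    """Insert `post` into `posts`, then read it back by its _id."""
    result = posts.insert_one(post)  # an InsertOneResult
    # find_one() returns None when nothing matches the query document.
    return posts.find_one({"_id": result.inserted_id})


# Hypothetical usage against a running mongod (requires PyMongo):
#   from pymongo import MongoClient
#   posts = MongoClient().test_database.posts
#   print(insert_and_fetch(posts, build_post()))
```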
To limit our results to a document with author “Mike”, we pass the query document {"author": "Mike"} to find_one(). If we try with a different author, like “Eliot”, we’ll get no result.

Querying By ObjectId

We can also find a post by its _id, which in our example is an ObjectId, by querying on the "_id" field. Note that an ObjectId is not the same as its string representation, so a query that uses the string form of the _id returns no result.

A common task in web applications is to get an ObjectId from the request URL and find the matching document. It’s necessary in this case to convert the string to an ObjectId before passing it to find_one().

Bulk Inserts

In order to make querying a little more interesting, let’s insert a few more documents. In addition to inserting a single document, we can also perform bulk insert operations by passing a list as the first argument to insert_many(). This will insert each document in the list, sending only a single command to the server.
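A bulk insert along the lines described above might look like the following sketch. The sample documents and the name `insert_posts` are invented for illustration, and `posts` is assumed to be a PyMongo collection. Two things are worth noticing in it: `insert_many()` reports one `_id` per inserted document, and the second document has different fields from the first, which MongoDB accepts because a collection does not enforce a fixed schema by default.

```python
import datetime

# Illustrative documents; the second has no "tags" field
# and adds a "title" field instead.
new_posts = [
    {
        "author": "Mike",
        "text": "Another post!",
        "tags": ["bulk", "insert"],
        "date": datetime.datetime(2009, 11, 12, 11, 14),
    },
    {
        "author": "Eliot",
        "title": "MongoDB is fun",
        "text": "and pretty easy too!",
        "date": datetime.datetime(2009, 11, 10, 10, 45),
    },
]


def insert_posts(posts, documents):
    """Insert all documents with a single command and return their _ids."""
    result = posts.insert_many(documents)  # an InsertManyResult
    return result.inserted_ids


# Hypothetical usage against a running mongod (requires PyMongo):
#   from pymongo import MongoClient
#   posts = MongoClient().test_database.posts
#   print(insert_posts(posts, new_posts))
```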
Querying for More Than One Document

To get more than a single document as the result of a query we use the find() method. find() returns a Cursor instance, which allows us to iterate over all matching documents. For example, we can iterate over every document in the posts collection by looping over the cursor. Just like we did with find_one(), we can pass a document to find() to limit the returned results, for example to get only those documents whose author is “Mike”.

Counting

If we just want to know how many documents match a query we can perform a count_documents() operation instead of a full query. We can get a count of all of the documents in a collection by passing an empty query document, {}, or a count of just those documents that match a specific query.

Range Queries

MongoDB supports many different types of advanced queries. As an example, let’s perform a query where we limit results to posts older than a certain date, but also sort the results by author. Here we use the special "$lt" operator to do a range query, and also call sort() to sort the results by author.

Indexing

Adding indexes can help accelerate certain queries and can also add additional functionality to querying and storing documents. In this example, we’ll demonstrate how to create a unique index on a key that rejects documents whose value for that key already exists in the index. First, we’ll need to create the index. Notice that we have two indexes now: one is the index on _id that MongoDB creates automatically, and the other is the index we just created.
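Creating and checking such a unique index could be sketched as follows. The function name `add_unique_index`, and the `profiles` collection and `user_id` key in the usage comment, are assumptions made for illustration. `collection` is assumed to be a PyMongo collection, and the direction value 1 is the value PyMongo exposes as `pymongo.ASCENDING`.

```python
def add_unique_index(collection, key):
    """Create an ascending unique index on `key` and list all index names."""
    # 1 means ascending order (the value of pymongo.ASCENDING).
    collection.create_index([(key, 1)], unique=True)
    # index_information() returns a dict keyed by index name.
    return sorted(collection.index_information())


# Hypothetical usage against a running mongod (requires PyMongo):
#   from pymongo import MongoClient
#   profiles = MongoClient().test_database.profiles
#   print(add_unique_index(profiles, "user_id"))
```

Once the index exists, inserting a document whose value for the indexed key is already present fails with `pymongo.errors.DuplicateKeyError`.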