In a Fish Story, there are three phases on the way to a goal: downstream, midstream, and upstream. Here, the Fish Story serves as an analogy for Big Data.
In Big Data, downstream is the phase of looking for data, which can come from anywhere. Midstream is the phase of processing that data with whatever tools fit the job. Upstream, the last phase, is where the data is processed further and presented in an interactive form. The whole process consists of data cleaning, data modelling, data visualization, and data testing.
This is where MarkLogic comes in, as one of the tools we can use to optimize the midstream phase. MarkLogic is a new-generation database that addresses the complexity of ever-changing data and content, providing better access to your data and enabling faster development of mission-critical applications without sacrificing reliability or security.
So, let me show you how it works.
MarkLogic is an enterprise database that can store, organize, and search:
- XML (eXtensible Markup Language): similar to HTML in syntax, but where HTML focuses on how data is displayed, XML is a format for storing and sending data and focuses on what the data is.
- Semantic data, or RDF (Resource Description Framework) triples: statements made up of a subject, a predicate, and an object.
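To make the two data shapes above more concrete, here is a minimal Python sketch. It does not use MarkLogic at all; it only relies on the standard library. The book record, the `hasTitle` and `hasAuthor` predicates, and the `objects_of` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the tags describe what the data is,
# not how it should be displayed.
xml_text = """
<book id="b1">
  <title>Fish Story</title>
  <author>Jane Doe</author>
</book>
""".strip()

book = ET.fromstring(xml_text)
title = book.findtext("title")
author = book.findtext("author")

# The same facts written as RDF-style triples: (subject, predicate, object).
# The predicate names are illustrative, not from a real vocabulary.
triples = [
    (book.get("id"), "hasTitle", title),
    (book.get("id"), "hasAuthor", author),
]


def objects_of(subject, predicate):
    """Return every object stored for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects_of("b1", "hasAuthor"))
```

The XML keeps the record together as one nested document, while the triples break it into independent statements that can be queried one fact at a time.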
MarkLogic is designed as a data hub: it collects data from various sources for both operational purposes (updating and processing data in real time) and analytical purposes (multi-dimensional analysis).
But before we go further with MarkLogic, we first need to cover a database basic: the difference between SQL and NoSQL.
So, what’s the difference between NoSQL and SQL?
SQL (Structured Query Language)
- SQL is a standard language for communicating with databases.
- SQL is the standard language for an RDBMS (Relational Database Management System). A relational database stores data in tables made up of rows and columns.
- SQL databases have a predefined schema.
- SQL databases are vertically scalable: to handle a bigger database, you typically upgrade the hardware of the server it runs on (CPU, RAM, etc.).
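As a small illustration of the predefined schema mentioned above, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table and its columns are hypothetical; the point is only that a row must match the columns declared up front.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema is declared before any data is stored.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Ayu", "Jakarta"))

# Inserting into a column that was never declared is rejected.
try:
    conn.execute(
        "INSERT INTO customers (name, phone) VALUES (?, ?)", ("Budi", "555-0100")
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute("SELECT name, city FROM customers").fetchall()
print(rows, rejected)
```

To store a phone number here, the table itself would first have to be changed (for example with `ALTER TABLE`), which is what a predefined schema means in practice.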
NoSQL (Not Only SQL)
- NoSQL is a data design approach that accommodates various data models, including document, key-value, column, and graph formats.
- NoSQL can be categorized as a distributed database: a database whose data is stored in two or more files at different sites, on the same network or on different ones.
- NoSQL has a dynamic schema, suited to unstructured data.
- NoSQL was built in the 2000s for large-scale databases behind cloud and web applications (Google, Amazon), with operational goals in mind.
- NoSQL is horizontally scalable: when traffic grows, we can simply add servers.
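The dynamic schema and horizontal scaling described above can be sketched in a few lines of Python. This is a toy model, not how MarkLogic or any particular NoSQL product works: the `ShardedStore` class, its methods, and the sample documents are all hypothetical, and each "server" is just a dictionary in memory.

```python
import hashlib


class ShardedStore:
    """Toy document store that spreads keys across several shards."""

    def __init__(self, shard_count):
        # Each dict stands in for one server in the cluster.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash of the key decides which shard owns it.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, document):
        self._shard_for(key)[key] = document

    def get(self, key):
        return self._shard_for(key).get(key)


store = ShardedStore(shard_count=3)

# Documents do not have to share the same fields (dynamic schema).
store.put("user:1", {"name": "Ayu", "city": "Jakarta"})
store.put("user:2", {"name": "Budi", "phone": "555-0100", "tags": ["vip"]})

print(store.get("user:2"))
```

Adding capacity in this model means creating the store with a larger `shard_count`. A real system would also have to move existing keys to their new shards when the count changes, which this sketch leaves out.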
Credit: Noor Azizah Rahmafani – MarkLogic Introduction – Tuesday Talk