Design & Architecture

A Primer On Open Source NoSQL Databases


This article introduces NoSQL databases: their properties, types and data models, and how they differ from a standard RDBMS.

1. Introduction

Relational databases (RDBMS) have been around for more than three decades. But in the era of social media, smartphones and the cloud, we generate large volumes of data at high velocity. The data also varies from simple text messages to high-resolution video files. Traditional RDBMS products struggle to cope with the velocity, volume and variety of data in this new era. In addition, many RDBMS products are commercially licensed and are typically run on proprietary, enterprise-class hardware. This has made way for open source NoSQL databases, whose basic properties are a dynamic schema, distributed storage and horizontal scalability on commodity hardware.

2. Properties of NoSQL

NoSQL is usually read as Not Only SQL. The basic qualities of NoSQL databases are that they are schemaless, distributed and horizontally scalable on commodity hardware. NoSQL databases offer a variety of functions to solve various problems with a variety of data types, whereas in an RDBMS the “blob” used to be the only data type for storing unstructured data.

2.1 Dynamic Schema

NoSQL databases allow the schema to be flexible. New columns can be added at any time. Rows may or may not have values for those columns, and there is no strict enforcement of data types for columns. This flexibility is handy for developers, especially when they expect frequent changes during the product life cycle.
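As a minimal sketch of what a dynamic schema means in practice, the snippet below models rows as plain Python dictionaries. The `users` list, its field names and the `column_values` helper are invented for illustration and do not belong to any particular database.

```python
# Hypothetical records: each "row" carries only the columns it needs.
users = [
    {"id": 1, "name": "Asha"},
    {"id": 2, "name": "Ravi", "email": "ravi@example.com"},  # new column, no migration
    {"id": 3, "name": "Mei", "age": "unknown"},               # no enforced data type
]

def column_values(rows, column):
    """Return the value of `column` for every row, or None where it is absent."""
    return [row.get(column) for row in rows]

emails = column_values(users, "email")
```

Rows that never received an `email` value simply report `None`, which is the sparse behaviour described above.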

2.2 Variety of Data

NoSQL databases can store structured, semi-structured and unstructured data. Logs, image files, videos, graphs, JSON and XML can be stored and operated on as they are, without any pre-processing. This reduces the need for ETL (Extract – Transform – Load).

2.3 High Availability Cluster

NoSQL databases support distributed storage on commodity hardware. They also support high availability through horizontal scaling. These features enable NoSQL databases to benefit from the elastic nature of cloud infrastructure services.

2.4 Open Source

Many NoSQL databases are open source software. They are free to use, and most of them may also be used in commercial products. The open source codebase can be modified to meet business needs. Open source licenses differ in their terms, so users must be aware of the license agreement of the product they choose.

2.5 NoSQL – Not Only SQL

NoSQL databases do not depend only on SQL to retrieve data. They provide rich APIs to perform CRUD (DML) operations. These APIs are more developer friendly and are supported in a variety of programming languages.

3. Types of NoSQL

There are four types of NoSQL databases: key-value databases, column oriented databases, document oriented databases and graph databases. At a very high level, most of these databases follow a structure similar to that of an RDBMS.

The database server might contain many databases. Each database might contain one or more tables. The table in turn has rows and columns to store the actual data. This hierarchy is common across NoSQL databases, but the terminology might vary.

3.1 Key Value Database

Many key-value databases were developed based on the Dynamo white paper published by Amazon. A key-value database allows the user to store data in a simple <key> : <value> format, where the key is used to retrieve the value from the table.

3.1.1 Data Model

The table contains many key spaces, and each key space can have many identifiers to store key-value pairs. The key space is similar to a column in a typical RDBMS, and the group of identifiers under the key space can be considered as rows.

It is suitable for building simple, highly available applications. Since most key-value databases support in-memory storage, they can also be used to build caching mechanisms.
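To make the key-value idea and its use as a cache concrete, here is a toy in-memory store written in Python. It is only a sketch: the `KeyValueStore` class, its `put`/`get` methods and the `ttl_seconds` parameter are hypothetical names, not the API of Redis, DynamoDB or any other product.

```python
import time

class KeyValueStore:
    """Toy in-memory key-value store with optional expiry (hypothetical API)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def put(self, key, value, ttl_seconds=None):
        expires_at = time.monotonic() + ttl_seconds if ttl_seconds is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like missing keys
            return default
        return value

store = KeyValueStore()
store.put("session:42", {"user": "asha"})
store.put("otp:42", "918273", ttl_seconds=0)  # expires immediately
```

The value is opaque to the store; the key is the only way to retrieve it, which is what keeps this model simple and fast.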

3.1.2 Example

DynamoDB, Redis

3.2 Column oriented Database

Column oriented databases are developed based on the Bigtable white paper published by Google. They take a different approach from a traditional RDBMS: columns can be added freely, so tables grow wider. Because the table can become very wide, related columns can be grouped under a family name, called a “Column Family” or “Super Column”. The column family is optional in some column databases. In line with the common philosophy of NoSQL databases, column values can be sparsely distributed.

3.2.1 Data Model

The table contains column families (optional). Each column family contains many columns. The values for the columns are stored as key-value pairs and may be sparsely distributed.
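The nesting described here can be sketched with plain Python dictionaries. The row keys, family names and the `read_cell` helper below are assumptions made for illustration; real column stores such as Cassandra or HBase have their own storage formats and APIs.

```python
# Hypothetical wide table: row key -> column family -> column -> value.
table = {
    "user:1": {
        "profile": {"name": "Asha", "city": "Chennai"},
        "activity": {"last_login": "2015-06-01"},
    },
    "user:2": {
        "profile": {"name": "Ravi"},  # sparse: no city, no activity family
    },
}

def read_cell(table, row_key, family, column):
    """Return one cell, or None when the row, family or column is absent."""
    return table.get(row_key, {}).get(family, {}).get(column)
```

Missing cells cost nothing to store, which is why very wide, sparsely populated tables remain practical in this model.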


Column oriented databases are an alternative to typical data warehousing databases (e.g. Teradata) and are suitable for OLAP kinds of applications.

3.2.2 Example

Apache Cassandra, HBase

3.3 Document-oriented Database

Document oriented databases store semi-structured data. It can be JSON, XML, YAML or even a Word document. The unit of data is called a document (similar to a row in an RDBMS). The table that contains a group of documents is called a “Collection”.

3.3.1 Data Model

The database contains many collections. A collection contains many documents. Each document might be a JSON document, an XML document, a YAML document or even a Word document.
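The database, collection and document hierarchy can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the `database` dictionary, the `orders` collection and the `find` helper are hypothetical and only mimic, very loosely, the kind of query a document database offers.

```python
# Hypothetical database: collection name -> list of documents.
database = {
    "orders": [
        {"_id": 1, "customer": "asha", "items": [{"sku": "A1", "qty": 2}]},
        {"_id": 2, "customer": "ravi", "items": [], "note": "gift wrap"},
    ]
}

def find(collection, **criteria):
    """Return documents whose top-level fields equal all the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

matches = find(database["orders"], customer="ravi")
```

Note that the two documents in the same collection have different fields and that one of them nests a list of items, something a flat relational row cannot hold directly.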


Document databases are suitable for Web based applications and applications exposing RESTful services.

3.3.2 Example

MongoDB, Couchbase

3.4 Graph Database

A real-world graph consists of vertices and edges. In graph databases they are called nodes and relations. Graph databases allow us to store and manipulate nodes, relations and the attributes of both.

Graph databases work better when the graphs are directed, i.e. when the relations between nodes have a direction.

3.4.1 Data Model

The graph database is a two-dimensional representation of a graph. The graph is similar to a table. Each graph has Node, Node Properties, Relation and Relation Properties as columns, and each row holds values for these columns. The values of the properties columns can be key-value pairs.

Graph databases are suitable for social media and network problems that involve complex queries with many joins.
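A small Python sketch can show why traversals replace joins in this model. The node names, the `FOLLOWS` relation type and both helper functions are invented for illustration; they are not the query API of Neo4j or any other graph database.

```python
# Hypothetical property graph: nodes and directed relations, both with properties.
nodes = {
    "asha": {"label": "Person", "city": "Chennai"},
    "ravi": {"label": "Person"},
    "mei": {"label": "Person"},
}
relations = [
    ("asha", "FOLLOWS", "ravi", {"since": 2014}),
    ("ravi", "FOLLOWS", "mei", {"since": 2015}),
]

def neighbours(node, rel_type):
    """Nodes reachable from `node` over one outgoing relation of `rel_type`."""
    return [dst for src, rel, dst, _props in relations if src == node and rel == rel_type]

def friends_of_friends(node):
    """Two-hop traversal, which would need a self-join in an RDBMS."""
    return sorted({fof for friend in neighbours(node, "FOLLOWS")
                   for fof in neighbours(friend, "FOLLOWS")})
```

Each extra hop is another traversal step rather than another join, which is the property that makes this model attractive for social and network data.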

3.4.2 Example

Neo4j, OrientDB, HyperGraphDB, GraphBase, InfiniteGraph

4. Possible Problem Areas

The following areas are important to consider while choosing a NoSQL database for a given problem statement.

4.1 ACID Transactions:

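Most NoSQL databases have traditionally not supported full ACID transactions across multiple records, e.g. MongoDB, Couchbase and Cassandra. Some have added transaction support in later releases, so check the documentation of the version you plan to use. [Note: To know more about ACID transaction capabilities, refer to the appendix below].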

4.2 Proprietary APIs / SQL Support

Some NoSQL databases do not support Structured Query Language; they only offer an API. There is no common standard for these APIs. Every database implements its APIs in its own way, so there is an overhead of learning them and developing a separate adapter layer for each database. Some NoSQL databases that do offer SQL do not support all standard SQL features.

4.3 No JOIN Operator

Due to the nature of the schema or data model, not all NoSQL databases support JOIN operations by default, whereas in an RDBMS the JOIN operation is a core feature. The query language in Couchbase supports join operations. In HBase they can be achieved by integrating with Hive. MongoDB did not support joins originally; newer versions provide a join-like $lookup stage in the aggregation framework.

4.4 Leeway of CAP Theorem

Most NoSQL databases take the leeway suggested by the CAP theorem and support only two of the three properties: Consistency, Availability and Partition tolerance. They do not support all three. [Note: Please refer to the appendix to know more about the CAP theorem].

5. Summary

NoSQL databases address problems, in both functional and non-functional areas, where an RDBMS struggles. In this article we have seen the basic properties, generic data models, types and features of NoSQL databases. To proceed further, pick any one NoSQL database and get hands-on.

 

Appendix A Theories behind Databases

A.1 ACID Transactions

ACID is an acronym for Atomicity, Consistency, Isolation and Durability. These four properties are used to measure the reliability of database transactions.

A.1.1 Atomicity

Atomicity means that database transactions must be atomic in nature. It is also called the all-or-nothing rule. The database must ensure that a single failure results in a rollback of the entire transaction up to the last commit point. The transaction is committed only if all of its operations are successful.
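The all-or-nothing rule can be demonstrated with SQLite from the Python standard library. This is a sketch, not a recommendation of SQLite for any workload: the `account` table, the account names and the `transfer` function are hypothetical, and a CHECK constraint is used only to force the second statement to fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "a", "b", 500)  # would overdraw account "a"
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

The credit to account "b" succeeds on its own, but because the debit from "a" violates the constraint, the whole transaction is rolled back and both balances stay as they were.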

A.1.2 Consistency

Databases must ensure that only valid data is allowed to be stored. In an RDBMS, this is all about enforcing the schema. In NoSQL, consistency varies depending on the type of database. For example, in a graph database such as Neo4j, consistency ensures that every relationship has a start node and an end node. MongoDB automatically creates a unique _id for each document, by default a 12-byte ObjectId that is usually displayed as a 24-character hexadecimal string.

A.1.3 Isolation

Databases allow multiple transactions to run in parallel. For example, when read and write operations happen in parallel, the read will not see the effect of the write until the write transaction is committed. Until then, the read operation sees only the previously committed data.

A.1.4 Durability

Databases must ensure that committed transactions are persisted to storage. There must be appropriate transaction and commit logs to ensure the data is written to disk.

A.2 Brewer’s CAP-Theorem

The CAP theorem states that any networked shared-data system can have at most two of three desirable properties: Consistency, Availability and Partition tolerance.

A.2.1 Consistency

In a distributed database system, all the nodes must see the same data at the same time.

A.2.2 Availability

The database system must be available to service every request it receives. Basically, the DBMS must be a highly available system.

A.2.3 Partition Tolerance

The database system must continue to operate despite arbitrary partitioning due to network failures.

Microservices Design Principles


The objective of this post is to understand micro services, the relevant software architecture, the design principles and the constraints to be considered while developing micro services.

1. Micro Services

Micro services are small autonomous systems that provide a solution that is unique and distinct within the eco-system. Each runs as a full-stack module and collaborates with other micro services that are part of the eco-system. Sam Newman defines micro services as “Small, Focused and doing one thing very well” in his book “Building Microservices”.

Micro services are created by slicing and dicing a single large monolithic system into many independent autonomous systems. A micro service can also be a pluggable add-on component that works along with an existing system, or part of a green field project.

2. Eco system

Though the concept of micro services is not new, the evolution of cloud technologies, agile methodologies, continuous integration and automatic provisioning (Dev Ops) tools led to the evolution of micro services.

2.1 Cloud Technologies

One of the important features of the cloud is “Elasticity”. The cloud allows the user to dynamically scale the performance and capacity of a system up and down by increasing or decreasing infrastructure resources such as virtual machines, storage and databases. If the software is one single large monolithic system, it cannot effectively utilize this capability, because its inner sub-modules and the communication pipes across the system can become bottlenecks that do not scale appropriately.

Since micro-services are small, independent, full-stack systems, they can efficiently use the elastic nature of the cloud infrastructure. Increasing or decreasing the number of instances of a micro-service directly affects the performance and capacity of the system, roughly in proportion.

2.2 Dev Ops

Dev Ops is a methodology that focuses on speeding up the path from software development to customer deployment. It concentrates on improving communication and collaboration between software development and IT operations through integration, automation and cooperation.

Micro services architecture helps meet the objectives of both software engineers and IT professionals. Being a small, independent component, a micro service is relatively easy to develop, test, deploy and recover (after a failure) when compared to large monolithic architectures.

2.3 Agile Methodologies

Agile is a software development process model that evolved from Extreme Programming (XP) and Iterative and Incremental (2I) development process models. Agile is best suited to small teams working on software deliverables where requirement volatility is high and time to market is short.

As per the Agile Manifesto, agile values:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

A small dynamic team that works in an agile process model and develops a micro service that is a small, independent, full-stack application will have complete product ownership with clear boundaries of responsibility.

3. Design of Micro Services

3.1 Characteristics of Micro Services

Micro services are designed to be small, stateless, in(ter)dependent and full-stack applications so that they can be deployed in cloud infrastructure.

Small : Micro services are designed to be small, but defining “small” is subjective. Estimation techniques such as lines of code, function points or use cases may be used, but they are not recommended estimation techniques in agile.

In the book Building Microservices, the author Sam Newman suggests a few techniques to define the size of a micro service: it should be small enough to be owned by a small agile development team, re-writable within one or two agile sprints (typically two to four weeks), or of a complexity that does not require it to be divided further into another micro service.

Stateless : A stateless application handles every request using only the information contained within that request. Micro services must be stateless and must service each request without remembering previous communications from the external system.

In(ter)dependent : A micro service must service requests independently, but it may collaborate with other micro services within the eco-system. For example, a micro service that generates a unique report after interacting with other micro services is an interdependent system. In this scenario, the other micro services, which only provide the necessary data to the reporting micro service, may be independent services.

Full-Stack Application : A full-stack application is individually deployable. It has its own server, network and hosting environment. The business logic, data model and service interface (API / UI) must all be part of the system. A micro service must be a full-stack application.

3.2 Architecture Principles

Though SOA is one of the important architecture styles that help in designing micro services, there are a few more architecture styles and design principles to be considered while designing micro services. They are:

3.2.1 Single Responsibility Principle (Robert C Martin)

Each micro-service must be responsible for a specific feature, a functionality or an aggregation of cohesive functionality. The rule of thumb for applying this principle is: “Gather together those things that change for the same reason, and separate those things that change for different reasons”.

3.2.2 Domain Driven Design

Domain driven design is an architectural principle in line with the object oriented approach. It recommends designing systems to reflect real-world domains. It considers the business domains, their elements and behaviors, and the interactions between them. For example, in the banking domain, individual micro services can be designed to handle business functions such as retail banking, on-line banking and on-line trading. The retail banking micro-service can offer the related services, e.g. opening a bank account, cash withdrawals and cash deposits.

3.2.3 Service Oriented Architecture

The Service Oriented Architecture (SOA) is an architecture style that enforces certain principles and philosophies. The following SOA principles should be adhered to while designing micro-services for the cloud.

3.2.3.1 Encapsulation

A service must encapsulate its internal implementation details, so that an external system that uses the service need not worry about the internals. Encapsulation reduces complexity and enhances the flexibility (adaptability to change) of the system.

3.2.3.2 Loose Coupling

Changes in one micro-service should have zero or minimal impact on other services in the eco-system. This principle also suggests using loosely coupled communication methods between micro services. For instance, RESTful APIs are more suitable than Java RMI, as the latter imposes a technology on other micro-services.

3.2.3.3 Separation of Concern

Develop micro-services based on distinct features with zero overlap with other functions. The main objective is to reduce the interaction between services so that they are highly cohesive and loosely coupled. Separating functionality along the wrong boundaries leads to tight coupling and increased complexity between services.

The above core principles give only the gist of SOA. There are more SOA principles and philosophies that fit nicely into the design principles of micro-services for the cloud.

3.2.4 Hexagonal Architecture

This architecture style was proposed by Alistair Cockburn. It allows an application to be equally driven by users, programs, automated tests or batch scripts, and to be developed and tested in isolation from its eventual run-time devices and databases. It is also called the “Ports and Adapters Architecture”, where the ports and adapters encapsulate the core application so that it responds uniformly to external requests. The ports and adapters handle external messages and convert them into the appropriate functions or methods exposed by the inner core application. A typical micro service exposes RESTful APIs for external communication, a message broker interface (e.g. RabbitMQ, HornetQ) for event notification and database adapters for persistence, which makes hexagonal architecture a most suitable style for micro service development.
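The relationship between the core, a port and its adapters can be sketched in Python as follows. All class and function names here (`OrderRepository`, `InMemoryOrderRepository`, `OrderService`, `rest_adapter`) are hypothetical and chosen only for illustration; a real service would plug in a web framework and a database driver behind the same seams.

```python
from typing import Protocol

class OrderRepository(Protocol):
    """Outbound port: what the core needs from persistence."""
    def save(self, order_id: str, amount: float) -> None: ...

class InMemoryOrderRepository:
    """Adapter used for tests; a database adapter would expose the same method."""
    def __init__(self) -> None:
        self.rows = {}

    def save(self, order_id: str, amount: float) -> None:
        self.rows[order_id] = amount

class OrderService:
    """Core application: knows the port, not the technology behind it."""
    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository

    def place_order(self, order_id: str, amount: float) -> str:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._repository.save(order_id, amount)
        return f"order {order_id} accepted"

def rest_adapter(service: OrderService, json_body: dict):
    """Inbound adapter: translates an HTTP-style request into a core call."""
    try:
        return 201, service.place_order(json_body["id"], float(json_body["amount"]))
    except (KeyError, ValueError) as exc:
        return 400, str(exc)

repo = InMemoryOrderRepository()
status, message = rest_adapter(OrderService(repo), {"id": "o-1", "amount": "25.5"})
```

Because the core depends only on the port, the same `OrderService` can be driven by a test, a batch script or a REST endpoint, and its persistence can be swapped without touching the business rule.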

Though there are many architectural styles and principles, the items above are highly relevant to micro services.

4 Design Constraints

The design constraints (non-functional requirements) are important decision makers while designing micro services. The success of a system depends heavily on Availability, Scalability, Performance, Usability and Flexibility.

4.1 Availability

The golden rule for availability says: anticipate failures and design accordingly, so that the system is available 99.999% of the time (Five Nines). This means the system can be down for only about 5.26 minutes in an entire year. The cluster model is used to support high availability; it suggests having a group of services run in Active-Active or Active-Standby mode.
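The downtime budget implied by an availability target is simple arithmetic: the fraction of unavailability multiplied by the minutes in a year. The short Python sketch below works it out; the function name is made up for illustration and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years

def allowed_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

five_nines = allowed_downtime_minutes(99.999)  # roughly 5.26 minutes
three_nines = allowed_downtime_minutes(99.9)   # roughly 525.6 minutes
```

Each additional nine cuts the budget by a factor of ten, which is why the step from three nines to five nines usually requires redundancy rather than just careful operations.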

So micro services must be designed for an appropriate clustering and high-availability model. The basic properties of micro-services, such as being stateless, independent and full-stack, help us run multiple instances in parallel in active-active or active-standby mode.

4.2 Scalability

Micro services must be scalable both horizontally and vertically. Being horizontally scalable means we can run multiple instances of the micro-service to increase the performance of the system. The design of micro services must support horizontal scaling (scale-out).

Micro-services should also be scalable vertically (scale-up). If a micro-service hosted on a system with a small configuration such as AWS EC2 t2.small (1 vCPU, 2 GB memory) is moved to m4.10xlarge (40 vCPUs and 160 GB memory), it should scale accordingly. Similarly, downsizing the system capacity must also be possible.

4.3 Performance

Performance is measured by throughput (e.g. 2500 TPS, transactions per second) and response time. The performance requirements must be available at the beginning of the design phase itself. Several technology and design choices affect performance. They are :

  • Synchronous or Asynchronous communication
  • Blocking or Non-blocking APIs
  • RESTful API or RPC
  • XML or JSON
  • Choice of data store: SQL or NoSQL
  • Choice of message broker: HornetQ or RabbitMQ
  • Choice of NoSQL database: MongoDB, Cassandra or Couchbase

So, appropriate technology and design decisions must be taken, to avoid re-work in the later stage.

4.4 Usability

The usability aspects of the design focus on hiding the internal design, architecture, technology and other complexities from the end user or other systems. Most of the time, micro services expose APIs to the end user as well as to other micro-services. So the APIs must be designed in a normalized way, so that the required service can be obtained with a minimal number of API calls.

4.5 Flexibility

Flexibility measures adaptability to change. In a micro-services eco-system, where each micro-service is owned by a different team and developed with an agile methodology, change happens faster than in other systems. Micro-services may stop inter-operating if they do not adapt to or accommodate changes in other systems. So there must be a proper mechanism in place to avoid such scenarios, which could include publishing the APIs, documenting functional changes and clear communication plans.

This briefly summarizes the important design constraints for micro-services.

5. New Problem Spaces

Though there are many positives with micro-services, they can create some new challenges.

5.1 Complete Functional Testing

End-to-end functional testing is a great challenge in a micro-services environment, because we might need to deploy many micro-services to validate a single business functionality. Each micro-service might have its own way of installation and configuration.

5.2 Data Integrity across the eco-system

Micro services run independently and asynchronously, and they communicate with each other through protocols or APIs. This can result in momentary data integrity issues, or in data going out of sync due to failures. So we might need additional services to monitor data integrity.

5.3 Increased Complexity

The complexity increases many fold when a single monolith is split into ten to twenty micro-services, and the introduction of load balancing, monitoring, logging and auditing servers into the eco-system increases the operational overhead. The competency needed to manage and deploy the micro-services also becomes critical, as IT admins and DevOps engineers need to be aware of the plethora of technologies used by independent agile development teams.

The articles “Microservices – Not a free lunch !” and “Service Disoriented Architecture” clearly warn us to be aware of the issues with micro services, though they greatly support and favour this architecture style.

6. Summary

The micro services architecture style offers many advantages: as discussed, it is well suited to cloud infrastructure, speeds up deployment and recovery, and minimizes the damage in case of failures. This article consolidates the knowledge areas in design, architecture and design constraints needed for designing micro-services. Thank you.

7. References

[1] Domain Driven Design – Quickly http://www.infoq.com/minibooks/domain-driven-design-quickly

[2] MSDN Software Architecture – https://msdn.microsoft.com/en-us/library/ee658093.aspx

[3] Building Microservices – Sam Newman

[4] Hexagonal Architecture – http://alistair.cockburn.us/Hexagonal+architecture

 

 

Web App Authentication Methods


Authentication Methods for Web Apps

1.    Authentication Methods

A user authentication mechanism specifies the way a user gains access to web content. When an authentication mechanism is specified, the user must be authenticated before access is granted to any resource that is protected by the security constraint.

I would broadly classify these authentication methods into two:

  • Credential Based authentication
  • Certificate Based Authentication

2. Credential Based Authentication:

Credential based authentication is considered the primitive method of authentication, where the user names, passwords (of late we prefer to call them passphrases) and roles are configured on the web server. The web app must identify itself by sending a user name and password to the server, so that the server can authenticate and authorize the user, based on this configuration, to access the web resource.

Following are the server based authentication mechanisms:

  • Basic Authentication
  • Form based authentication
  • Digest authentication

HTTP basic authentication and form based authentication are not secure authentication mechanisms. In these mechanisms the target server is not authenticated. Basic authentication sends user names and passwords over the internet as base64-encoded text, which is trivially decoded. Form based authentication sends them as plain text. So this data must be sent over a secure transport mechanism (SSL) to protect the user name and password from being snooped on the network.

2.1 Basic Authentication

HTTP basic authentication requires that the server request a user name and password from the web client and verify that they are valid by comparing them against a database of authorized users.

The sequence of actions for Basic Authentication is as follows:

  1. Client requests access to protected content
  2. The server responds with an authentication challenge, and the browser shows a dialog box that requests the user name and password
  3. The client submits the user name and password to the server
  4. Server authenticates the user by verifying the credentials in its database
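What the client actually sends in step 3 is an `Authorization` header carrying the base64 encoding of `username:password`. The Python sketch below builds and decodes such a header to show why this offers no confidentiality on its own; the function names and the sample credentials are invented for illustration.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the HTTP Authorization header for Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

def decode_basic_auth(header_value: str):
    """Recover the credentials; anyone who sees the header can do the same."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password

header = basic_auth_header("alice", "s3cret")
```

Since decoding needs no key, the header is only as safe as the transport that carries it, which is why Basic authentication is normally used over HTTPS.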

2.2 Form Based Authentication

Form based authentication allows the developer to control the look and feel of the login authentication screens by customizing the login screen and error pages that an HTTP browser presents to the end user.


Following are the actions for form based authentication:

  1. Client requests access to a protected resource
  2. If the client is unauthenticated, the server redirects the client to a login page
  3. Client submits the login form to the server
  4. The server attempts to authenticate the user:
    1. If authentication succeeds, the authenticated user’s principal is checked to ensure that it is in a role that is authorized to access the resource.
    2. If the user is authorized, the server redirects the client to the resource requested by the client
    3. If authentication fails, the client is forwarded or redirected to an error page.
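The steps above can be sketched as two small Python functions operating on a session dictionary. This is an illustration only: the `USERS` table, the URL strings and both function names are hypothetical, and a real server would store salted password hashes rather than plain passwords.

```python
# Hypothetical user store; a real server would keep salted password hashes.
USERS = {"alice": {"password": "s3cret", "roles": {"reports"}}}

def handle_request(session: dict, resource: str, required_role: str) -> str:
    """Decide the server's response for a protected resource (steps 1, 2 and 4)."""
    user = session.get("user")
    if user is None:
        session["requested"] = resource  # remember where the client wanted to go
        return "redirect:/login"
    if required_role not in user["roles"]:
        return "error:403"
    return f"serve:{resource}"

def handle_login(session: dict, username: str, password: str) -> str:
    """Process the submitted login form (steps 3 and 4)."""
    record = USERS.get(username)
    if record is None or record["password"] != password:
        return "redirect:/login-error"
    session["user"] = {"name": username, "roles": record["roles"]}
    return "redirect:" + session.pop("requested", "/")
```

The session is what ties the steps together: it remembers the originally requested resource across the login redirect and carries the authenticated user on later requests.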

2.3 Digest Authentication

Like basic authentication, digest authentication authenticates a user based on a user name and a password. But digest authentication does not send the password over the network in plain text. The client sends a one-way cryptographic hash of the password and additional data. Although plain text passwords are not sent on the wire, digest authentication requires that clear-text password equivalents be available to the authenticating container so that it can validate received authenticators by calculating the expected digest.
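As a rough sketch of the calculation, the Python snippet below computes the response value for the simplest form of HTTP Digest authentication (MD5, without the optional qop extension) as laid out in RFC 2617. The user name, realm, password and nonce values are made up, and the helper names are not part of any library.

```python
import hashlib

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    """Response value for the basic form of HTTP Digest (no qop), per RFC 2617."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # the "password equivalent"
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

# The client and the server compute the same value independently.
client_value = digest_response("alice", "example", "s3cret", "GET", "/reports", "nonce-1")
server_value = digest_response("alice", "example", "s3cret", "GET", "/reports", "nonce-1")
# A different server nonce yields a different response, which limits replay.
replayed = digest_response("alice", "example", "s3cret", "GET", "/reports", "nonce-2")
```

The server can only recompute the expected value if it holds the password or the first hash (`ha1`), which is the clear-text password equivalent mentioned above.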

3. Certificate Based Authentication

In certificate based authentication, a certificate issued by a Certification Authority is used to authenticate the user. The certificate is a kind of digital passport, issued by an authority that has certified the user.

Following are the methods of Certificate Based Authentication:

  • Client Authentication
  • Mutual authentication

3.1 Client Authentication

Client authentication is a certificate based secure authentication method. The client must have a public key certificate, which is the digital equivalent of a passport. The certificate is issued by a trusted organization, called a Certificate Authority, which provides identification for the bearer.

The web server authenticates the client by using the client’s public key certificate.  Client authentication is a more secure method of authentication than basic authentication methods.

It uses HTTP over SSL (HTTPS), in which the server authenticates the client using the client’s public key certificate. SSL provides data encryption, server authentication, message integrity and client authentication for a TCP/IP connection.
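On the server side, requiring a client certificate is usually a single setting on the TLS configuration. The sketch below uses Python's standard `ssl` module; the function name is hypothetical and the file paths in the comments are placeholders, so no certificates are actually loaded here.

```python
import ssl

def build_server_context(require_client_cert: bool) -> ssl.SSLContext:
    """TLS server context; certificate and CA files are loaded separately."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if require_client_cert:
        # Ask for a client certificate and reject the handshake without a valid one.
        context.verify_mode = ssl.CERT_REQUIRED
    # In a real deployment (paths are placeholders):
    # context.load_cert_chain("server.crt", "server.key")
    # context.load_verify_locations("trusted-client-ca.pem")
    return context

client_auth_context = build_server_context(require_client_cert=True)
server_only_context = build_server_context(require_client_cert=False)
```

With `CERT_REQUIRED`, the handshake itself fails for clients that cannot present a certificate signed by one of the trusted authorities, so unauthenticated requests never reach the application.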

3.2 Mutual Authentication

In mutual authentication, both the client and the server authenticate each other.

3.2.1 Certificate Based Mutual Authentication

The flow of events in mutual authentication based on certificates is as follows:

  1. Client requests access to a protected resource
  2. Webserver presents its certificate to the client
  3. Client verifies the server’s certificate
  4. If successful, the client sends its certificate to the server
  5. The server verifies the client’s credentials
  6. If successful, the server grants access to the protected resource

3.2.2 Mutual Authentication Based On User Name and Password

The sequence of events in mutual authentication based on user name and password is as follows:

  1. Client sends the request to access the protected content
  2. Webserver presents its certificate to the client
  3. Client verifies the server’s certificate
  4. If successful, the client sends its username and password to the server
  5. The server verifies the client’s credentials
  6. If successful, server grants access to client on the protected content.

Summary

In this write-up I have summarized the authentication methods. Though I started out studying authentication methods for Web Apps, I don’t see any difference between authenticating a Web App and Web Pages. The Java EE manual was a very good reference for this study.

Same Origin Policy & Exemptions


The Web Browser Security

Web browsers defend against malicious code by restricting certain features of client-side JavaScript:

  • No arbitrary file operations (file operations are allowed only in a sandboxed environment)
  • No general purpose networking capabilities (only restricted APIs such as Web Sockets are available)
  • A new browser window can be opened only in response to a user-initiated event (e.g. a mouse click)
  • Other browser windows can be closed only on user confirmation
  • The value property of HTML file upload elements cannot be set
  • A script cannot read the content of documents loaded from different servers than the document that contains the script (the same origin policy)

In this write-up I’ll focus on the Same Origin Policy (SOP) and its exemptions.

Same Origin Policy

The same-origin policy is a security restriction on the web content that JavaScript code can interact with. It affects the following:

  • Access to DOM elements of a document loaded from another host
  • Scripted HTTP requests to another host

This comes into effect when a web page contains <iframe> element or opens other browser windows.  In this case, a script can read only the properties of windows and documents that have the same origin as the document that contains the script.

What Is Same Origin?

The origin of the document is defined as the protocol, host and port of the URL from which the document was loaded.

  • Documents loaded from different web servers (hosts) have different origins
  • Documents loaded through different ports of the same host have different origins
  • Documents loaded using different protocols (http, https etc.) have different origins, even if they come from the same host.
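
The three rules above can be sketched as a small comparison function.  This is only an illustration of the definition, not how a browser implements the check; the helper name sameOrigin and the example URLs are my own assumptions.  It relies on the standard URL class, which is available in browsers and in Node.js.

```javascript
// Minimal sketch: two URLs share an origin only if protocol, host and port all match.
// sameOrigin is a hypothetical helper name, not a browser API.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  // The URL class reports a default port (80 for http, 443 for https) as "",
  // so "http://example.com" and "http://example.com:80" compare as equal.
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

console.log(sameOrigin("http://example.com/a.html", "http://example.com/dir/b.html")); // true
console.log(sameOrigin("http://example.com/", "http://shop.example.com/"));            // false (host)
console.log(sameOrigin("http://example.com/", "http://example.com:8080/"));            // false (port)
console.log(sameOrigin("http://example.com/", "https://example.com/"));                // false (protocol)
```

Note that the path plays no part in the comparison: only the protocol, host and port matter.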

The Script Origin

It is important to note that the origin of the script itself is not relevant to the same-origin policy.  What matters is the origin of the document in which the script is embedded.

For example, suppose a script hosted by Host A is included in a web page served by Host B.  For the purposes of the same-origin policy, the origin of the script is Host B, and the script has full access to the content of the document that contains it.  If the script opens another window and loads a document from Host C into it, the same-origin policy comes into effect and prevents the script from accessing that document.

The same-origin policy affects all the properties of the document object of a window or frame that was loaded from a different host.  For example, if your script opened the window, your script can close it, but it cannot access the properties of the document.

Relaxing the Same-origin Policy

In some situations the same-origin policy is too restrictive.  So, HTML5/JavaScript provides ways to work around the restrictions while keeping security intact.

Domain

The same-origin policy poses problems for large websites that use multiple subdomains.  For example, a script in a document from home.example.com might want to read properties of a document loaded from developer.example.com, or interact with scripts from order.example.com.  But under the same-origin policy this is restricted.

If two windows (or frames) contain scripts that set “document.domain” to the same value, the same-origin policy is relaxed for these two windows, and each window can interact with the other.

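For example, cooperating scripts in documents loaded from orders.example.com and catalog.example.com might set their “document.domain” properties to “example.com”.  Note that setting “document.domain” is now deprecated and current browsers restrict or ignore it, so the other two mechanisms described here are preferred.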

Cross Origin Resource Sharing (CORS)

CORS allows Web Servers to expose services to 3rd party applications and Web Applications hosted elsewhere on the internet.

CORS is a standard (originally a W3C specification, now part of the Fetch standard) that extends HTTP with a new Origin request header and a new Access-Control-Allow-Origin response header.  It allows servers to use a header to explicitly name the origin that may request a file, or to use a wildcard and allow a file to be requested by any site.

Browsers use these headers to allow cross-origin HTTP requests with XMLHttpRequest that would otherwise have been forbidden by the same-origin policy.

To keep it simple:

  • If the User Agent (Web Browser) identifies an XHR request as a cross-origin request, it adds an “Origin” header carrying the origin of the page that is making the request.  For requests that are not simple (for example a PUT or DELETE, or a request with custom headers), it first sends a preflight HTTP OPTIONS request with the “Origin” header before actually sending the request itself.
  • On receiving an HTTP request with an “Origin” header, the Web Server / Web Host checks its configuration to see whether that particular origin is allowed to access the specific resource.  If it is allowed, the server sends the response with an “Access-Control-Allow-Origin” header set to that origin (or to the wildcard “*”).  In response to a preflight request it also sends the list of allowed HTTP methods (e.g. GET, PUT, POST & DELETE) in the “Access-Control-Allow-Methods” header.
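
The server-side check described above can be sketched as a plain function that maps request headers to response headers.  This is a minimal sketch and not tied to any particular web server: the function name corsHeaders, the allow-list and the example origins are all assumptions of mine, and the header names are assumed to arrive in lower case (as they do in Node.js).

```javascript
// Hypothetical configuration: origins allowed to access the resource.
const ALLOWED_ORIGINS = new Set([
  "https://app.example.com",
  "https://admin.example.com",
]);
// HTTP methods the server is willing to accept from those origins.
const ALLOWED_METHODS = ["GET", "PUT", "POST", "DELETE"];

// Returns the CORS response headers for a request, or an empty object
// when the request has no Origin header or the origin is not allowed.
function corsHeaders(requestHeaders, allowedOrigins = ALLOWED_ORIGINS) {
  const origin = requestHeaders["origin"];
  if (!origin || !allowedOrigins.has(origin)) {
    // Without Access-Control-Allow-Origin the browser will not
    // hand the response to the requesting script.
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": ALLOWED_METHODS.join(", "),
    // Tells caches that the response depends on the Origin header.
    "Vary": "Origin",
  };
}

console.log(corsHeaders({ origin: "https://app.example.com" }));
console.log(corsHeaders({ origin: "https://unknown.example.net" })); // {}
```

Echoing back the one matching origin, rather than sending the wildcard, keeps the list of trusted sites private and under the control of the server's configuration.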

Further Reading:

Example: http://www.html5rocks.com/en/tutorials/cors/

Security Issues: https://code.google.com/p/html5security/wiki/CrossOriginRequestSecurity

CORS deserves a separate write-up with more details.  I’ll do it soon.

Cross Document Messaging

This mechanism allows a script in one document to pass a text message to a script in another document, irrespective of the origins of the two documents.

Calling the postMessage() method on a Window object results in the asynchronous delivery of a message event.

The receiving window uses the onmessage event handler to receive any message sent using postMessage().
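
Here is a minimal sketch of both sides of such an exchange.  Because messages can arrive from any origin, the receiver should check the origin property of the event before trusting the data.  The helper name acceptMessage, the constant TRUSTED_ORIGIN and the example hosts are assumptions for illustration; postMessage(), the message event and its origin and data properties are the standard browser API.

```javascript
// Hypothetical origin of the document we expect messages from.
const TRUSTED_ORIGIN = "https://orders.example.com";

// Returns the message payload if the sender's origin is trusted, otherwise null.
function acceptMessage(event, trustedOrigin = TRUSTED_ORIGIN) {
  if (event.origin !== trustedOrigin) {
    return null; // ignore messages from unexpected origins
  }
  return event.data;
}

// Browser-only wiring; skipped when no window object exists (e.g. under Node.js).
if (typeof window !== "undefined") {
  // Receiver side: the message event is delivered asynchronously.
  window.addEventListener("message", (event) => {
    const data = acceptMessage(event);
    if (data !== null) {
      console.log("received:", data);
    }
  });
  // Sender side, given a reference to the other window
  // (for example an iframe's contentWindow):
  //   otherWindow.postMessage("hello", "https://catalog.example.com");
}
```

The second argument to postMessage() names the origin the sender expects the target window to have, so the message is not delivered if that window has since navigated elsewhere.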

Summary

The same origin policy is enabled by default in all major browsers.  By using “document.domain”, CORS and Cross Document Messaging we can work around the limitations of SOP without compromising the security of the web browser.