This documentation refers to an unsupported open source library made available by the Shibboleth Project as a service to the community. It is not an official software product of the project and does not have formal releases, and is generally untested by the development team. It is available under the standard Apache 2.0 software license.
Amazon Redshift is a data warehousing service available through AWS. Like most AWS services, it offers a variety of security models, and like most databases, it supports built-in user and group management features. Unlike most services and unlike essentially all databases, it also supports a mechanism for leveraging AWS's SAML support for federated login by database clients using Amazon's JDBC and ODBC drivers.
This article describes an open source Java plugin written by a member of the Shibboleth team that interfaces to the Amazon JDBC driver and to the Shibboleth IdP (or any SAML-compliant IdP) using the ECP (Enhanced Client or Proxy) profile designed for non-browser SAML authentication. There are some advanced features that are designed in part to support some Shibboleth features, but it's largely a vanilla Java ECP client that implements a now publicly committed API in the Amazon JDBC driver to pass it back a SAML response from the IdP for use by AWS. It also makes use of an HTTP client library embedded inside the JDBC driver jar, to limit the additional dependencies needed.
This is where all the hard stuff is, and there's no way I can come close to documenting it all clearly or accurately. Get help from Amazon if you want to do all this safely. There is a really good chance you'll end up with a security hole if you don't know what you're doing with AWS Roles and Policies, and if you don't test exhaustively. You have been warned.
This document doesn't (yet anyway) cover every aspect of making Redshift (or AWS) work with your IdP, so there's a lot of leg work assumed, but the highlights are:
You need an IdP definition/trust established with the AWS account to support SAML use.
You need an IAM Role defined in AWS with a trust policy that is usable by that IdP and attached to a resource Policy that grants access under at least some conditions to a Redshift cluster. That's a "deep" topic and Amazon has extended their documentation with more examples, but a simple policy we've tested with looks like this:
This example is very "open" in that it grants access to any cluster, but it's "closed" in that it restricts the database user account to one that already exists and whose name matches the name inside the SAML Subject <NameID> element. This ties things together in a way that's suitable for using the database as a control point, rather than auto-creating users. Groups in the database can also be used, since the user accounts are "long lived" and not based on temporary names.
You need an IdP configured to supply the "one, big, honking" AWS SAML SP (which Amazon identifies as "urn:amazon:webservices"). You need to modify the metadata you give the IdP for this AWS SP so that the IdP believes it can respond to it using the ECP "PAOS" binding. Adding the following as an AssertionConsumerService is sufficient:
The IdP must be configured to pass a number of specialized, proprietary Attributes: one carrying the matching IAM role expression, a RoleSessionName identifier for logging, and possibly a custom DbUser attribute containing the Redshift username. With certain approaches, such as the one shown above, the <NameID> value must also match the DbUser attribute.
There are other ways to do this safely, such as tying the database username in the policy to the "aws:userid" policy key that should be derived from the RoleSessionName SAML Attribute Amazon defines.
All of these things have to line up for things to work. Amazon's documentation covers all the gory details of what these SAML Attributes have to be named and what they contain. If this isn't done carefully and tested well, it's possible to screw up and leave holes where users can override the user identity with the JDBC driver.
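As a rough illustration of how those Attributes relate to each other, the sketch below builds the values an IdP would need to release for one user and checks that the <NameID> lines up with DbUser. The attribute names are the ones Amazon publishes to the best of our knowledge, so verify them against Amazon's documentation. The class, its methods, the account number, and the role and provider names are all hypothetical placeholders, and this is not code from the plugin or from any IdP.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AwsSamlAttributesSketch {

    // Attribute names as published by Amazon; verify against their documentation.
    public static final String ROLE = "https://aws.amazon.com/SAML/Attributes/Role";
    public static final String ROLE_SESSION_NAME = "https://aws.amazon.com/SAML/Attributes/RoleSessionName";
    public static final String DB_USER = "https://redshift.amazon.com/SAML/Attributes/DbUser";

    /** The Role value pairs the IAM role ARN with the SAML provider ARN, comma-separated. */
    public static String roleValue(String roleArn, String providerArn) {
        return roleArn + "," + providerArn;
    }

    /** Builds the attribute set for one user (assumes the session name and DbUser are both the username). */
    public static Map<String, String> attributesFor(String username, String roleArn, String providerArn) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put(ROLE, roleValue(roleArn, providerArn));
        attrs.put(ROLE_SESSION_NAME, username);
        attrs.put(DB_USER, username);
        return attrs;
    }

    /** True when the NameID equals the DbUser attribute, as a NameID-restricted policy would require. */
    public static boolean nameIdMatchesDbUser(String nameId, Map<String, String> attrs) {
        return nameId != null && nameId.equals(attrs.get(DB_USER));
    }

    public static void main(String[] args) {
        // Hypothetical account number, role, and provider names.
        Map<String, String> attrs = attributesFor("jdoe",
                "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
                "arn:aws:iam::123456789012:saml-provider/ExampleIdP");
        attrs.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println("NameID matches DbUser: " + nameIdMatchesDbUser("jdoe", attrs));
    }
}
```

A check like the last method is only a local sanity test. The real enforcement has to come from the IAM policy conditions, which is why exhaustive testing against AWS itself is still needed.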
As a general matter, it's best to get the AWS web console working with your IdP with regular SAML IdP-initiated SSO and a working IAM Role before tackling Redshift. Then the jump to supporting the additional bits is smaller.
The following steps are needed to get a client ready for use:
Configure a custom JDBC data source type in your Java tool of choice.
Step 3 is the wildcard, because it's specific to the client tool used. JDBC client applications typically have some special way, buried somewhere in their long list of menu options, to define "non-standard" driver types. When you define a custom data source like this, you get to point it to the set of jars that make up the "driver", in this case the two jars needed. Often you can also create templates with common properties for new connections to use, but this is optional and is best ignored while testing things out.
Some of these are more useful than others but you'll need to set the user and password at some point, or enter them in real time.
One special extension point is the "ecp_headers" pointer to a property file, which can carry custom HTTP headers. This is particularly useful to provide special authentication features such as Multi-Factor login signaling to the IdP.
Port for the IdP
Path to IdP's ECP endpoint
Username at IdP (NOT the Redshift username)
Password at IdP
Classpath resource containing an ECP AuthnRequest template
XML Element tagname used by IdP for SAML Response
Pathname to properties file containing custom HTTP request headers to include
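To make the "ecp_headers" idea more concrete, here is a minimal sketch of how a properties file of custom headers could be read into name/value pairs. It assumes the file uses the standard Java properties format with one header per line, which is an assumption and not something confirmed here. The class name, the method, and the header names are hypothetical, and this is not the plugin's actual parsing code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

public class EcpHeadersSketch {

    /** Parses properties-format text into header name/value pairs, sorted by header name. */
    public static Map<String, String> parseHeaders(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
        Map<String, String> headers = new LinkedHashMap<>();
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            // Properties.load keeps trailing whitespace in values, so trim it here.
            headers.put(name, props.getProperty(name).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        // Hypothetical header names; the headers your IdP looks at will differ.
        String example = "X-Example-MFA = required\nX-Example-Client = redshift-jdbc\n";
        parseHeaders(example).forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

In practice the text would come from the file named by "ecp_headers" rather than from a string literal, and the resulting pairs would be added to the ECP request sent to the IdP.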