Archive for the ‘REST’ Category

Ramp Up for QCon 2012 San Francisco

I am getting ready to attend QCon 2012 in San Francisco; this will be my third QCon SF. You can read my posts for the 2011 edition here and here.  I enjoy QCon and find the tutorials especially useful.  But let me say right off that I don’t attend on my own dime.  In the past I have attended QCon and Microsoft’s MIX conference.  Sadly, Microsoft has decided to kill MIX; this is bad for Microsoft and for the developer community.  I have written posts about MIX here.  Now that MVC 4 is open sourced, one wonders why Microsoft corporate has killed MIX.

QCon is run by the same folks who put out the web magazine InfoQ.  For my money (and it’s not), the QCon schedule in the past has been a little long on Agile and Java and not strong enough on JavaScript libraries. The 2012 tracks rectify this problem somewhat. Where QCon is strongest is in its embrace of open and free (or almost free) technologies to power the web next year.  This year’s tracks include a reprise of “Cross Platform Mobile”, “Programmable Web”, “No SQL” (on all three days) and a bevy of tracks on hard core web development (“Real Time Web”, “UX”, “Taming HTML5 and JavaScript” and “Dynamic Languages for the Web”), plus the usual cast of characters: Java, Agile and what looks like a strong three day track of end to end “solutions”. Strangely, JQuery is not featured anywhere in this conference.

The tutorials include a full day session on cross platform development using PhoneGap.  Robinson and Weber, who presented an excellent and well attended session on RESTful development, will return with a Neo4j programming class.  With excellent timing, Peter Bell will present a four hour tutorial on CoffeeScript.  If there were a MIX 2012, they would be talking about how MVC 4 is supporting CoffeeScript (sigh).  Tracks and tutorials can be found here and here.  Note the venue has also changed this year from the centrally located, and very urban, Union Square area of SF to the more upscale Embarcadero area. For folks who like to explore the city, as opposed to pub crawling, this is a bummer.  But it is ocean view and close to the (now departed) Occupy SF site.

Well, those are the basic facts; now the question is: should YOU attend QCon 2012?  The best part of QCon is that it is not a standard vendor conference and (except for Agile) nobody is trying to sell you anything.  The attendees tend to be working programmers from startups in the SF area and from Europe, which helps the sessions stay very focused on what really works for programmers and startup firms.  There is no focus on mega systems like Oracle products or SharePoint.  In this environment, CouchDB (and this year’s bad boy, Neo) are about as establishment as you get. QCon has traditionally been very friendly towards NoSQL databases and has been a consistently good source of information on this topic.  This year promises to continue the trend.  I regularly avoid pure vendor sessions, so of the three days of regular conference sessions I can look forward to about 2.5 days of good sessions.  The tutorials that I have attended are top notch and honest.  I cannot evaluate the daily keynote speeches since I normally sleep in past these.  Attendees have told me the keynote party and mixer is nice, but I am always in the Haight district when this happens on Wednesday night.  If the boss is paying and you are coding the web and web devices, you must attend this conference.  If it’s your dime….

I will be blogging QCon 2012 on my Nexus 7.

On Adding Custom Restful Web Services to SharePoint 2010

 

At my shop we have data which resides in SQL Server and FileNet.  We want this data to be callable from the browser during SharePoint sessions, and we want it to be called in a RESTful manner with the results returned in JSON format.  Using Microsoft’s (open sourced) Web API in the wild would be simple, easy and fun.  But this is SharePoint, and the Framework version supported by SharePoint 2010 is (necessarily) Framework 3.5.  So we must use the WCF API and package the delivery of the endpoints using a farm solution.  Ouch.  Without getting into the precise definition of what REST is, as a practical matter for this project this means that the browser client calls into the SharePoint web server with an HTTP GET call of the form:

http://YOURSHAREPOINTSERVER:80/_vti_bin/Endpoints.svc/sp/INQ_AGENCY_AGCY_DETAIL?agcy_cd=123

and the results returned are in pure JSON format.
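As a rough illustration of that URL shape, here is a minimal sketch of how a test client might compose the call. The class name EndpointUrlSketch and its Build method are made up for this example; the only things taken from this post are the _vti_bin/Endpoints.svc/sp/ path and the sample operation and parameter names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of SharePoint or WCF): composes the GET URL
// for an operation exposed under /_vti_bin/Endpoints.svc/sp/.
public static class EndpointUrlSketch
{
    public static string Build(string serverAndPort, string operation,
                               IEnumerable<KeyValuePair<string, string>> query)
    {
        var url = new StringBuilder();
        url.Append("http://").Append(serverAndPort)
           .Append("/_vti_bin/Endpoints.svc/sp/")
           .Append(Uri.EscapeDataString(operation));

        // Each key and value is escaped so spaces, '&' and '=' cannot break the query string.
        char separator = '?';
        foreach (KeyValuePair<string, string> pair in query)
        {
            url.Append(separator)
               .Append(Uri.EscapeDataString(pair.Key))
               .Append('=')
               .Append(Uri.EscapeDataString(pair.Value));
            separator = '&';
        }
        return url.ToString();
    }
}
```

Calling Build( "YOURSHAREPOINTSERVER:80", "INQ_AGENCY_AGCY_DETAIL", ... ) with a single agcy_cd=123 pair should reproduce the URL shown above.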

When we work on SharePoint we normally use a version of the CKS – Development Tools Edition extension for Visual Studio.  This is available here as a vsix file.   Although there are several blogs and (older) PDS presentations on custom web services, our touchstone blog is Sohel’s Blog.  We learned most of what we know from this blog.  We must step lightly as we develop these web services, since small errors lead to HTTP 500 and 404 errors which can drive you CRAZY.  So let’s get our ducks in order and move through the process systematically.  Microsoft supports several different types of web service factories, as documented here and summarized as:

  • SOAP service:  MultipleBaseAddressBasicHttpBindingServiceHostFactory
    • Basic HTTP binding must be used, which creates endpoints for a service based on the basic HTTP binding.
  • REST Service:  MultipleBaseAddressWebServiceHostFactory
    • The service factory creates endpoints with Web bindings.
  • ADO.NET Data Service: MultipleBaseAddressDataServiceHostFactory
    • A data service host factory can be used.

For our purposes we are only interested in the REST Service Factory.

Begin the process by creating a new solution and adding a farm solution as a SharePoint 2010 project:

image

image

You can change the Site URL later if you need to. It is the project property called Site URL (select the project and press F4).  Be sure that the assembly name and default namespace are set as you want them before adding any project items (Project Properties / Open):

image

Now add a Web Service item to this project.  We will use the CKSDev Web Template (on Visual Studio 2010 you may need to search for this template):

image

Open the EndPoints.svc file and change the default Service Factory from MultipleBaseAddressBasicHttpBindingServiceHostFactory to MultipleBaseAddressWebServiceHostFactory .

Now open the interface file (IEndpoints.cs) and replace the entire contents with:

using System.ServiceModel;
using System.ServiceModel.Web;
using System.ServiceModel.Activation;
using System.Web;

namespace Z
{
    [ServiceContract]
    public partial interface IEndPoint
    {
        // Answers GET .../Endpoints.svc/SP/Ping with a JSON string.
        [OperationContract]
        [WebInvoke( UriTemplate = "SP/Ping", Method = "GET", BodyStyle = WebMessageBodyStyle.Bare, RequestFormat = WebMessageFormat.Json, ResponseFormat = WebMessageFormat.Json )]
        string Ping();
    }
}

//mind the word wrap on the WebInvoke line

This file defines what the endpoint will look like.  We are starting with the minimal endpoint we can define and will add a real world example later in this post.  The attributes ServiceContract and OperationContract should be familiar to you.  The WebInvoke attribute is what makes this an HTTP GET RESTful call: it defines the path fragment (SP), the public method name (Ping) and the format (JSON).  The internal method name is (here) the same as the public method name (but they need not be the same).  Likewise, the URI path fragment can be omitted.

Open the EndPoints.svc.cs file and replace the contents with:

using Microsoft.SharePoint.Client.Services;
using System.ServiceModel.Activation;
using System.ServiceModel;
using System.ServiceModel.Web;

namespace Z
{
    [AspNetCompatibilityRequirements( RequirementsMode = AspNetCompatibilityRequirementsMode.Required )]
    public partial class EndPoint : IEndPoint
    {
        public EndPoint( )
        {
            //this code is always called for EACH method call
        }

        public string Ping( )
        {
            return "Pong: ";
        }
    }
}

We want ASP.NET compatibility so that the application and session caches are available to us.

Now let’s set up the deployment package in this project.   Double click on the project folder: Package.  If all has gone well the package design surface should look like this:

image

Do a final build on the project, then right click on the project and select Deploy.  Watch your output window; you should see a successful deployment.

Now let’s build a simple test program to test this.

Use either a console or a Windows Forms project, including the following code fragments:

using System;
using System.IO;
using System.Net;
using System.Text;
using System.Windows.Forms;

//your prolog code goes here

string urlString = @"http://{myserver}/_vti_bin/Endpoints.svc/SP/Ping";

string result = worker( urlString );

private string worker( string urlString )
{
    const int _MaxRedirect = 4;
    const int _MaxHeadersLen = 4;   // kilobytes
    string responseString = string.Empty;
    try
    {
        HttpWebRequest myRequest = ( HttpWebRequest ) WebRequest.Create( urlString );
        myRequest.MaximumAutomaticRedirections = _MaxRedirect;
        myRequest.MaximumResponseHeadersLength = _MaxHeadersLen;
        myRequest.Method = "GET";
        myRequest.UseDefaultCredentials = false;
        myRequest.PreAuthenticate = true;
        myRequest.Credentials = new NetworkCredential( "{yournetworkname}", "{yournetworkpassword}", "{yourdomain}" );

        using ( HttpWebResponse response = ( HttpWebResponse ) myRequest.GetResponse( ) )
        using ( Stream receiveStream = response.GetResponseStream( ) )
        using ( StreamReader readStream = new StreamReader( receiveStream, Encoding.UTF8 ) )
        {
            responseString = readStream.ReadToEnd( );
        }
    }
    catch ( WebException wx )
    {
        // GetResponse throws on HTTP errors (404, 500, ...); the status lives on wx.Response.
        HttpWebResponse errorResponse = wx.Response as HttpWebResponse;
        if ( errorResponse != null )
        {
            responseString = errorResponse.StatusCode.ToString( ) + " - " + errorResponse.StatusDescription;
        }
        else
        {
            responseString = wx.Message;
        }
    }
    catch ( Exception xx )
    {
        responseString = xx.Message;
    }
    return responseString;
}
//mind the word wrap

Everything here should be familiar to you.  The only funny part is the actual URL.

string urlString = @"http://{myserver}/_vti_bin/Endpoints.svc/SP/Ping";

The _vti_bin is a virtual folder within the SharePoint application which maps to a real folder:

_vti_bin = {yourdefaultdriveletter}:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\isapi

We can move the actual endpoints to a folder within _vti_bin, as will be demonstrated later.  OK, run your test program and let’s see if you get back the desired “Pong” response.  I’ll wait.

OK, everything running smoothly?  Let’s move the endpoints to a new folder and add an endpoint which does something. In Visual Studio click on the project and select RETRACT.  Watch your output window for success.  To change the folder location to a folder under _vti_bin, click on the .svc file and select Properties.  Expand the Deployment Location and add a folder under ISAPI in the Path property:

image

Our full endpoint path now would be (after deployment): http://{myserver}/_vti_bin/BlogTest.SharePoint.Services/Endpoints.svc/SP/Ping.  Build and deploy the project.  Change your path in the test program and test.  Good Luck.

Now let’s add a more useful endpoint.  Retract the solution. Open the interface file (IEndpoints.cs) and add code within the IEndPoint interface like:

[OperationContract]
[WebInvoke(Method = "GET", UriTemplate = "SP/mymethod?ID={ID}", BodyStyle = WebMessageBodyStyle.Bare, RequestFormat = WebMessageFormat.Json, ResponseFormat = WebMessageFormat.Json)]
myreturnclass mymethod(string ID);

Note that myreturnclass must include appropriate attributes to be serializable as JSON.  We do this with code like this:

[DataContract]
public class myreturnclass
{
    [DataMember]
    public string myID { get; set; }

    public myreturnclass( string ID )
    {
        myID = ID;
    }
}
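To see what those attributes buy you on the wire, here is a minimal sketch that pushes a DataContract class through DataContractJsonSerializer, the serializer WCF uses for WebMessageFormat.Json. SampleResult and JsonSketch are stand-in names invented for this example; only the myID member mirrors the class above.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical stand-in for myreturnclass, reduced to its one data member.
[DataContract]
public class SampleResult
{
    [DataMember]
    public string myID { get; set; }
}

public static class JsonSketch
{
    // Serializes any DataContract-annotated object to a JSON string.
    public static string ToJson(object value)
    {
        var serializer = new DataContractJsonSerializer(value.GetType());
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

JsonSketch.ToJson( new SampleResult { myID = "123" } ) yields {"myID":"123"}. Only members marked [DataMember] appear in the output, which is why a return class missing the attributes tends to come back empty.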

Now open the implementation file (Endpoints.svc.cs) and add code like the following to the EndPoint class:

public myreturnclass mymethod( string ID )
{
    return new myreturnclass( ID );
}
Build the project and Deploy.

Add a second urlString for this new method to your test file and retest.  Good Luck.

Have fun.  I know I did.

 

ASP.Net Web API: Cross-Domain AJAX and Server Techniques: JSONP and CORS

AJAX

AJAX calls make the modern web go around.  Interactive web applications make AJAX calls using the HTTP protocol to call HTTP endpoints on servers to deliver new content to the browser without round tripping the page.  Normally we try to architect RESTful endpoints on Web API servers.  Although the X in AJAX once meant XML, it now stands for nothing since most AJAX calls over the web package the data as JSON, not XML. JSON eliminates the need for the complex proxy code required for SOAP and XML and assures fidelity between the data format and the consuming code (JavaScript) within the browser.  We let the Web be the Web, RESTful HTTP ruled and life was good. Although one can make AJAX calls in JavaScript using fairly low level calls with vendor specific code, most of us use a helper library of one type or another.  I normally use JQuery as my browser abstraction layer in order to minimize browser differences, and I find their code to be pretty darn smart.  Much smarter than my JavaScript code.  As we toyed around with this AJAX thing, it was a logical extension of the concept to attempt to make AJAX calls to servers NOT in the domain which served up the page to the browser.  That is the WEB, right?  We want to connect data from all over the web, not just from the server which originally delivered our page.  In formal terms this is known as Cross-Origin Resource Sharing (CORS), and it does not work without special cooperation between the browser code, the browser and the server code.

The Server Side

Making AJAX calls to domains other than the server of the calling page requires that the server support either JSONP or CORS. Of the two methods, CORS is the newer and the preferred method.  However, CORS is not supported by all browsers, and only by some versions of the supporting browsers.  JSONP can be used from all browsers except Opera.  Some history is in order here.  The W3C, which at one time controlled (or at least managed) these things, was pretty slow moving prior to the development of smart phones (i.e. the iPhone).  Indeed, the “Same Origin Policy” explicitly forbade cross domain resource sharing.  Although there has been quite a bit of pressure for cross domain AJAX since the turn of the century, no new spec was forthcoming. To work around this, clever coders (on the server and browser sides) took what was once a security exploit (the “cross-site request forgery”) and turned it into a virtue (JSONP).  Eventually the W3C developed a “Working Draft” on CORS (the original draft is dated March 2009 and the most recent draft is April 2012).  As this draft stabilized, server code could be developed and browser vendors could enable CORS support.  OK, let’s look at some server code.  I am going to use examples for GET requests with NO security (clearly the simplest case; I will follow up with folding in POST and security in a later blog entry).

Controllers On ASP.Net Web API

A bare bones Web API Get controller for ASP.Net MVC 4 might look like:

DeadBase2.DAL.DeadBase2Entities _DBContext = new DAL.DeadBase2Entities( );

public IEnumerable<DAL.ViewConcert> JSONP( string MM, string DD )
{
    return _DBContext.ViewConcerts.ToList( )
        .Where( c => string.Compare( c.YYYYMMDD.Substring( 4, 4 ), MM + DD ) == 0 )
        .OrderBy( c => c.YYYYMMDD );
}

This will return JSON if the caller includes the media type header with the argument:

text/javascript

In this form the controller can return JSON, but it does not yet support CORS or JSONP.

We can limit the controller to HTTP GET requests by adding the attribute

[System.Web.Mvc.AcceptVerbs( HttpVerbs.Get )]

There have been some attempts to develop an attribute to limit controller methods to AJAX-only calls.  Since these are usually based on examining header values, I take a dim view of this: headers from the caller can be faked and falsified.

Let’s expand the controller model to include the ability to return HTTP error codes:

public HttpResponseMessage<IEnumerable<DAL.ViewConcert>> JSONP( string MM, string DD )
{
    HttpResponseMessage<IEnumerable<DAL.ViewConcert>> msg;
    _Concerts = _DBContext.ViewConcerts.ToList( )
        .Where( c => string.Compare( c.YYYYMMDD.Substring( 4, 4 ), MM + DD ) == 0 )
        .OrderBy( c => c.YYYYMMDD );
    if ( _Concerts.Count( ) == 0 )
    {
        msg = new HttpResponseMessage<IEnumerable<DAL.ViewConcert>>( System.Net.HttpStatusCode.Forbidden );
    }
    else
    {
        msg = new HttpResponseMessage<IEnumerable<DAL.ViewConcert>>( _Concerts );
    }
    return msg;
}

These changes will allow us to return HTTP error codes to AJAX and CORS calls.  JSONP callers will not receive these codes (more on this below).

Enabling CORS On The Server

To enable CORS on the server side, all we need to do is add the appropriate W3C recommended server-side headers to the response just prior to returning from the method.  The most important of these is “Access-Control-Allow-Origin”.  To allow all CORS callers, the argument value is the wild card:

Access-Control-Allow-Origin: *

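To limit the callers to specific cross-domain origins, name the origin instead. The working draft’s syntax allows a space-separated list, like:

Access-Control-Allow-Origin: http://example.com:8080 http://foo.example.com

but browsers in practice expect a single origin (or the wild card), so a server with several trusted origins usually echoes back the caller’s Origin value when it is on the list.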

In our model controller method we simply add the following line just prior to the return statement:

msg.Headers.Add( "Access-Control-Allow-Origin", "*" );
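If the wild card is too generous, one common approach is to compare the caller's Origin request header against a list of trusted origins and send back just that one origin. The following is only a sketch under that assumption; CorsSketch, AllowOriginFor and the two sample origins are invented for illustration and are not part of Web API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical allow-list check: decides what to send in Access-Control-Allow-Origin.
public static class CorsSketch
{
    // Sample origins only; replace with the sites you actually trust.
    private static readonly HashSet<string> AllowedOrigins =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "http://example.com:8080",
            "http://foo.example.com"
        };

    // Returns the header value to send, or null when no CORS header should be added.
    public static string AllowOriginFor(string requestOrigin)
    {
        if (string.IsNullOrEmpty(requestOrigin))
        {
            return null;   // not a cross-origin browser call
        }
        return AllowedOrigins.Contains(requestOrigin) ? requestOrigin : null;
    }
}
```

In the controller you would then add the header only when the returned value is not null, in place of the hard-coded wild card.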

Browser Side Code for CORS

Simple, easy and fun.  This header tells the browser to allow the returned data stream to be passed to your application code (your JavaScript).  Not all browsers support the working draft.  Currently the list includes:

Gecko 1.9.1 (Firefox 3.5, SeaMonkey 2.0) and above.

WebKit (initial revision uncertain; Safari 4 and above, Google Chrome 3 and above).

MSHTML/Trident 4.0 (Internet Explorer 8 and 9) provides partial support via the XDomainRequest object.

Assuming you are working with a conformant browser, a CORS call in JQuery looks like:

function CallAJAX() {
    $.ajax({
        type: "GET",
        url: "http://localhost:8080/deadbase2api/api/Concerts/MMDD/12/31",
        dataType: "json",
        success: function (data) { alert('Success AJAX'); alert('Objects Returned: ' + data.length); },
        error: function (XMLHttpRequest, textStatus, errorThrown) { alert("Error AJAX"); }
    });
}

 

See this link for more JQuery details.  If you are not using IE8 or IE9, that’s it. Done.  If you are using IE8 or IE9, additional steps are required.  Since IE8 and IE9 use the XDomainRequest object for CORS calls, JQuery $.ajax calls will not work out of the box.  You must include a JQuery plug-in (get a local copy from here).  Include this file after you have loaded the JQuery library (we load our JQuery at run time from a Google edge server).  Now the above $.ajax code will work on all CORS supporting browsers.  Two more notes on IE: in my tests IE9 and IE8 will not trap HTTP errors in CORS calls (strange but true).  The somewhat mythical IE10 reportedly will not use the XDomainRequest object and will be conformant with the WebKit model, in which case the extra JQuery plug-in will not be necessary.  We shall see.

Enabling JSONP on the Server

A cross-domain JSONP AJAX call includes a query string argument named “callback” with a value determined by the caller.  In our example code the caller might compose something like this:

http://localhost:8080/deadbase2api/api/Concerts/MMDD/12/31?callback=mycallbackfunction

and pass the proper JSON media type header (text/javascript).  It’s the server’s job to do two tasks:

  • process the call and output JSON formatted data; and
  • wrap the JSON data in the argument of the callback function like this: mycallbackfunction({JSON DATA goes here})
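The wrapping step itself is small enough to sketch. This is not the formatter discussed in this post; JsonpSketch and Wrap are hypothetical names, and the identifier check is an extra precaution I am assuming you would want, since the callback name is echoed straight back into script the browser will execute.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper showing the JSONP wrapping step only.
public static class JsonpSketch
{
    // Accepts plain (optionally dotted) JavaScript identifiers such as "cb" or "app.onData".
    private static readonly Regex SafeCallback =
        new Regex(@"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*\z");

    // Produces callback(json), the body a JSONP caller expects to receive.
    public static string Wrap(string callback, string json)
    {
        if (string.IsNullOrEmpty(callback) || !SafeCallback.IsMatch(callback))
        {
            throw new ArgumentException("callback is not a plain JavaScript identifier", "callback");
        }
        return callback + "(" + json + ")";
    }
}
```

For the sample request above, Wrap( "mycallbackfunction", json ) returns mycallbackfunction( ...the JSON... ), while anything that is not a simple identifier is rejected instead of being echoed back.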

Inside of the controller method we could read the value of the callback query string argument and generate the necessary string.  A slightly more elegant approach is to add a JSONP formatter to do this for all method calls which meet the criteria for a JSONP response. Alex Zeitler has contributed to GitHub an excellent JSONP formatter which can be hooked into the Web API system.  Get it here.  His blog post on the formatter is here.  This guy rocks.  Pull his code into a code file in your Web API project (change the namespace if you need to).  Add the formatter in the Application_Start event in the global.asax file.  My Application_Start code looks like this:

protected void Application_Start( )
{
    AreaRegistration.RegisterAllAreas( );

    RegisterGlobalFilters( GlobalFilters.Filters );
    GlobalConfiguration.Configuration.Formatters.Insert( 0, new DeadBase2.Formatters.JsonpMediaTypeFormatter( ) );
    RegisterRoutes( RouteTable.Routes );

    BundleTable.Bundles.RegisterTemplateBundles( );
}

This formatter works at the model level; that is, it will apply to all controller methods which the formatter tests as qualifying for JSONP formatting.

Browser Side Code For JSONP

OK, now you are good to go to respond to JSONP requests. All browsers that we know of, except Opera, support JSONP calls for cross-domain resource calls.  In JQuery the call looks like:

function CallJSONP() {
    $.ajax({
        type: "GET",
        url: "http://localhost:8080/deadbase2api/api/Concerts/MMDD/12/31",
        dataType: "jsonp",
        success: function (data) { alert('Success JSONP'); alert('Objects Returned: ' + data.length); },
        error: function (XMLHttpRequest, textStatus, errorThrown) { alert("ERROR JSONP"); alert(textStatus); } //see note below
    });
}

JQuery will generate the appropriate query string and provide a unique random name to use as the value for the callback key in the query string.  Note that JQuery uses “jsonp” as the value for “dataType”; it uses this to generate a query string which includes a random name for the callback function, provide a JSON media type header, decode the response and call the success function.  For most browsers, HTTP errors of all types will not be trapped and the error function will never be called (this includes 404 errors!).  Internally generated errors within the controller method (like the System.Net.HttpStatusCode.Forbidden in the model code above) will generate a trappable error in IE8 and IE9 only; however, the error returned is NOT the HTTP error code but a strange “parse error” message.  All of this weirdness about error code trapping is caused by the browser code, not the server code and not the JavaScript code.  A review of the wire traffic using Fiddler (you do use Fiddler for debugging, don’t you?) shows that the proper HTTP error codes are returned to the browser; the browser just throws them away.  AJAX calls which are not cross domain do receive the HTTP status codes properly. Strange.

Cross Domain Error Response Summary:

         Chrome        IE9           Firefox       Safari
CORS     Trapped       Not Trapped   Trapped       Trapped
JSONP    Not Trapped   Trapped*      Not Trapped   Not Trapped

* Reported as a “parse error”, not as the HTTP status code.

 

Ok. That’s All Folks.

Web API: HTTP Restful Endpoints, The WCF Way (Part III)

Part I: Overview and Serialization
Part II: WCF Endpoints

Part III: Implementing WCF Endpoints (This Post).

Our concern in this post is how to dynamically create RESTful HTTP endpoints.  Our design goals are:

  • host the Endpoints in IIS
  • create the endpoints at run time using C# code
  • zero annotations  added to web.config
  • ease and clarity of coding

The work here is based on code formats presented in Flanders’ RESTful .NET and code presented in the first two parts of this series.  Flanders’ model code looks like this:

ServiceHost sh = new ServiceHost( typeof(HostingExample) );

ServiceEndpoint se = sh.AddServiceEndpoint( typeof(HostingExample), new WebHttpBinding( ), "http://localhost:8080/Hosting" );

se.Behaviors.Add( new WebHttpBehavior( ) );

sh.Open( );

This is pretty straightforward (thank you, Anders Hejlsberg). The downside is the tight binding that seems to require a match between the class type used with the ServiceHost

new ServiceHost( typeof(HostingExample) )

and the type applied to the ServiceEndpoint

AddServiceEndpoint( typeof(HostingExample), new WebHttpBinding( ), "http://localhost:8080/Hosting" )

Taken at face value this would require us to have a new ServiceHost for each endpoint we wanted to create; not good.  We, following Flanders, can do better.  Watch carefully, it’s all done with mirrors.

Step 0: Define an Interface for each endpoint you wish to implement, in our case one of these would look like:

[ServiceContract]
public interface IGetTest
{
[OperationContract( )]
[WebGet( UriTemplate = "*" )]
Message AllURIs( Message msg );
}

Step 1: Declare your class as implementing the interface and as derived from the base class (a la Part II):

[ServiceContract]
[AspNetCompatibilityRequirements( RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed )]
public partial class GetTest : CGetBaseClass, IGetTest { }

Note that this is declared as a partial class; CGetBaseClass (see Part II) is the class from which this endpoint class is derived and IGetTest is the interface which GetTest implements.  The body of this class is left blank!

See here for notes on AspNetCompatibilityRequirementsMode.

Step 2: Define the contents of your new class:

public partial class GetTest
{
    protected override object GetObject( IncomingWebRequestContext requestCTX, NameValueCollection queryParameters )
    {
        //Your working code goes here
    }
}

Step 3: Create a class which implements your interface:

public partial class CommonGetInterfaceClass : IGetTest
{
    Message IGetTest.AllURIs( Message msg )
    {
        GetTest getClass = new GetTest( );
        return getClass.AllURIs( msg );
    }
}

Note that this is defined as a partial class (we will come back to this below).

Step 4: Now you can dynamically create your endpoint as:

ServiceHost = new WebServiceHost( typeof(CommonGetInterfaceClass) );
ServiceEndpoint se = ServiceHost.AddServiceEndpoint( typeof(IGetTest), new WebHttpBinding( ), "http://localhost/Endpoint/Test" );
ServiceEndPoint.Add( se );
ServiceHost.Open( );

Step 5: Repeat Step 0 through Step 3 for each additional endpoint you wish to define.  For this to work you must define a unique interface for each class you wish to implement as an endpoint.  Since we have defined CommonGetInterfaceClass as a partial class, we can repeat Step 3 as many times as we like in our code (with a different interface annotation) and the Framework will pull these together for us. If you are unclear on partial classes see this article.  Now our Step 4 looks like:

ServiceHost = new WebServiceHost( typeof(CommonGetInterfaceClass) );
ServiceEndpoint se = ServiceHost.AddServiceEndpoint( typeof(IGetTest), new WebHttpBinding( ), "http://localhost/Endpoint/Test" );
ServiceEndpoint se2 = ServiceHost.AddServiceEndpoint( typeof(IGetTest2), new WebHttpBinding( ), "http://localhost/Endpoint/Test2" );
ServiceEndPoint.Add( se );
ServiceEndPoint.Add( se2 );
ServiceHost.Open( );

The only caveat here is that each endpoint must share a common root URL.  In our example this is “http://localhost/Endpoint/”.

Run this code within global.asax in the Application_Start method and you are all set.
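Because every endpoint has to live under that one root, it can help to build the addresses from the root rather than typing each one out. The sketch below uses only System.Uri; EndpointAddressSketch and its two methods are hypothetical names, not anything from WCF.

```csharp
using System;

// Hypothetical helper: derives endpoint addresses from a shared root URL.
public static class EndpointAddressSketch
{
    // Combines the root with an endpoint name. A missing trailing slash on the root
    // is added first; without it Uri would replace the last path segment.
    public static Uri Combine(string rootUrl, string endpointName)
    {
        if (!rootUrl.EndsWith("/", StringComparison.Ordinal))
        {
            rootUrl += "/";
        }
        return new Uri(new Uri(rootUrl), endpointName.TrimStart('/'));
    }

    // True when the address sits under the shared root.
    public static bool IsUnderRoot(string rootUrl, Uri address)
    {
        return Combine(rootUrl, string.Empty).IsBaseOf(address);
    }
}
```

Combine( "http://localhost/Endpoint", "Test2" ) gives http://localhost/Endpoint/Test2, and the resulting string can be passed as the address argument when adding the endpoint.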

OK! This is good, RESTful endpoint creation looks more like a Bagel factory rather than a nerd fest.

Security

The only security issue we are required to touch on in this series is the set of permissions needed for dynamic endpoint creation.  Since we are hosting our endpoints within IIS, we are concerned with the security of the application pool associated with the running code.  Typically this will be the recommended identity “Network Service”; unfortunately, out of the box this identity does not have the authorization to run the code in Step 4.  What to do?  There are two choices: run the application pool under the “Local System” identity (not recommended), or reserve the base URL of our endpoints using httpcfg.exe (Windows Server 2003) or netsh (Server 2008 or above).  Microsoft gives an overview of this process in this article.   Here, dear reader, is how to run this command to reserve URL endpoints for dynamic use by “Network Service”:

httpcfg.exe set urlacl /u http://+:80/EndPoint/ /a O:AOG:DAD:(A;;RPWPCCDCLCSWRCWDWOGA;;;S-1-5-20)

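On Windows Server 2008 (or above) the command is simpler; substitute the account that runs the application pool for DOMAIN\user (for Network Service that is "NT AUTHORITY\NETWORK SERVICE", quotes included):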

netsh http add urlacl url=http://+:80/EndPoint/ user=DOMAIN\user

Run this command while logged in with the equivalent of local System Administrator rights. Watch your return code: you must get a zero value for the command to be a success.  Run this once only (you do not need to repeat it on successive reboots).  (There is an unsupported GUI for this called httpconfig.exe that you may want to look at.)

OK! Was this fun or what?


 

Web API: HTTP Restful Endpoints, The WCF Way (Part II)

Part I: Overview and Serialization
Part II: WCF Endpoints (This Post)

Part III: Implementing WCF Endpoints

RESTful GET Requests in WCF

WCF for REST provides a strong set of tools to develop HTTP endpoints in a RESTful manner, optimized for being called by browsers and returning JSON in the response body.  In this post we will look at how to develop a WCF RESTful endpoint to respond to HTTP GET requests.  Our emphasis will be on:

  • Serialization
    • XML
    • JSON
    • Text
  • Header processing
    • Accept Headers
  • Dynamic HTTP Endpoint creation
    • Simple methods to make endpoints visible
  • Ease of Development
    • Consistency
    • Clarity

We will not deal in this post with HTTP POST or with security (authentication and authorization) issues.  From 10,000 feet the processing tasks performed during an HTTP GET request are:

  • Authentication
  • Header Processing
  • Passing Control to a Worker object
    • Authorization
    • return a generic Object To the HTTP Get Processor
  • Error Processing
  • Serialization
    • Based on Accept Header Arguments

IIS (Microsoft’s web server) handles HTTP requests using HTTP.SYS (IIS 6, IIS 7).  IIS handles each request as a stream, and the key developer hook into that stream is the method call:

public Message AllURIs( Message msg )

where Message is defined as:

using System;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.Xml;

namespace System.ServiceModel.Channels
{
// Summary:  Represents the unit of communication between endpoints in a distributed environment.
public abstract class Message : IDisposable
{…}

}

Note, dear reader, that Message is an object which represents a SOAP envelope.  RESTful WCF allows us to generate a non-SOAP response, eliminating the overhead of the XML SOAP envelope and returning pure XML, JSON or text.  Our task is to create our own AllURIs method which will return a Message of our own design.  When a request is in the stream and (our version of) AllURIs is called, the request is running in a particular context. The following calls make this context available to us:

WebOperationContext webCtx = WebOperationContext.Current;
IncomingWebRequestContext requestCTX=webCtx.IncomingRequest;
OutgoingWebResponseContext  outgoingCtx = webCtx.OutgoingResponse;

IncomingWebRequestContext.Method allows us to access what type of request we are processing (GET, POST, etc).

IncomingWebRequestContext.Headers allows us to access the Headers associated with the request.

IncomingWebRequestContext.UriTemplateMatch.QueryParameters allows us to access the query parameters associated with the request.

To process any incoming message, our first task will be to confirm the request type (i.e. an HTTP GET request) and handle an invalid request type appropriately.  A typical error response is to set the HTTP status code to a well defined value and return a null Message object.

if ( requestCTX.Method != "GET" )
{
    outgoingCtx.StatusCode = System.Net.HttpStatusCode.BadRequest;
    return null;
}

Accept Headers

Our next task is to process any request Headers which are important to us.  In our case we want to process the Accept Header to see how we will serialize the output object.  We can locate the Accept header several ways, but WCF provides a helper method to ease the process:

IncomingWebRequestContext.Accept

Accept headers contain a list of acceptable formats for the response body (our Message class) as defined by the caller (typically the browser).  These are known as MIME types and fall into three categories:  well known types (such as application/xhtml+xml), vendor defined types (like application/vnd.myformat) and the generic “give me anything” type (*/*). An Accept header is NOT mandatory, and when present it may (often will) contain more than one MIME type.  WCF does not provide a helper function to parse the Accept header, but since it is a simple comma delimited string we can call string.Split to get an array of accept values (each of which may carry an optional ;q= weight). Once we know what responses the caller can accept, our job is to return our C# object in the proper serialized form.  I will discuss three serializations: JSON (application/json), XML (application/xml) and text (text/plain). We assume here that the C# object to be serialized has been annotated with the appropriate DataContract and XML attributes (see Part I).  Our basic tool is the Message method CreateMessage. This method is overloaded and we will use this form:

public static Message CreateMessage(
	MessageVersion version,
	string action,
	Object body,
	XmlObjectSerializer serializer
)

If cObject represents the object we wish to serialize, our basic call is:

Message msg = Message.CreateMessage( MessageVersion.None, "*", cObject, {our serializer goes here} );

So our goal now is to create a serializer for each of the types we wish to support. Let’s start with the JSON serializer (have you been waiting all this time for this?):

// requires: using System.Runtime.Serialization.Json; using System.ServiceModel.Web;
private static XmlObjectSerializer _JSONSerializer( object msg )
{
    // tell WCF to encode the outgoing body as JSON rather than XML
    WebBodyFormatMessageProperty formatter = new WebBodyFormatMessageProperty( WebContentFormat.Json );
    OperationContext.Current.OutgoingMessageProperties.Add( WebBodyFormatMessageProperty.Name, formatter );
    WebOperationContext.Current.OutgoingResponse.ContentType = "application/json";
    return new DataContractJsonSerializer( msg.GetType( ) );
}

and our call becomes:

Message msg = Message.CreateMessage( MessageVersion.None, "*", cObject, _JSONSerializer( cObject ) );

Our XML serializer is just as simple:

private static XmlObjectSerializer _XMLSerializer( object msg )
{
    // same pattern as the JSON serializer, with the XML body format and content type
    WebBodyFormatMessageProperty formatter = new WebBodyFormatMessageProperty( WebContentFormat.Xml );
    OperationContext.Current.OutgoingMessageProperties.Add( WebBodyFormatMessageProperty.Name, formatter );
    WebOperationContext.Current.OutgoingResponse.ContentType = "application/xml";
    return new DataContractSerializer( msg.GetType( ) );
}

and our call becomes:

Message msg = Message.CreateMessage( MessageVersion.None, "*", cObject, _XMLSerializer( cObject ) );

The only “hard” serializer is for plain text (text/plain). For this we first need to derive our own class from BodyWriter:

// requires: using System.Text;
public class TextBodyWriter : BodyWriter
{
    private readonly byte[] messageBytes;

    public TextBodyWriter( string message )
        : base( true )   // true = the body is buffered
    {
        this.messageBytes = Encoding.UTF8.GetBytes( message );
    }

    // for WebContentFormat.Raw the payload is written as base64 inside a <Binary> element
    protected override void OnWriteBodyContents( System.Xml.XmlDictionaryWriter writer )
    {
        writer.WriteStartElement( "Binary" );
        writer.WriteBase64( this.messageBytes, 0, this.messageBytes.Length );
        writer.WriteEndElement( );
    }
}

then our text serializer becomes:

public static Message CreateRawMessage( object msg )
{
    try
    {
        string sMsg = msg.ToString( ); //See Note Below
        Message reply = Message.CreateMessage( MessageVersion.None, null, new TextBodyWriter( sMsg ) );
        // mark the body as raw so WCF does not wrap it as XML or JSON
        reply.Properties[WebBodyFormatMessageProperty.Name] = new WebBodyFormatMessageProperty( WebContentFormat.Raw );
        WebOperationContext.Current.OutgoingResponse.ContentType = "text/plain";
        return reply;
    }
    catch ( Exception xx )
    {
        throw new ApplicationException( "CreateRawMessage", xx );
    }
}

Because CreateRawMessage already builds and returns the Message, our call becomes simply:

Message msg = CreateRawMessage( cObject );

Note: In order to make this code as generic as possible I am overriding the ToString method of the C# object to generate the text output we want.  Here is our example object from Part I with the ToString method overridden:

[DataContract]
public class CObject
{
    [DataMember]
    public string ID { get; set; }
    [DataMember]
    public string Name { get; set; }
    [DataMember]
    public List<string> Data { get; set; }

    public CObject( )
    {
        Data = new List<string>( );
    }

    // "override" is required; without it this method only hides Object.ToString
    public override string ToString( )
    {
        return string.Format( "ID:{0} Name: {1}", ID, Name );
    }
}

We can wrap this all up into a base class to make deriving new endpoints easy, as follows:

// requires: using System.Collections.Specialized; (for NameValueCollection)
public class CGetBaseClass
{
    public Message AllURIs( Message msg )
    {
        WebOperationContext webCtx = WebOperationContext.Current;
        IncomingWebRequestContext requestCTX = webCtx.IncomingRequest;
        OutgoingWebResponseContext outgoingCtx = webCtx.OutgoingResponse;
        Message response = null;

        //process security here

        //process headers here

        //call a virtual method to do the actual work, passing in the request context
        //(in case there are special headers to be processed by the worker method)
        //and the query parameters
        object thing = GetObject( requestCTX, requestCTX.UriTemplateMatch.QueryParameters );

        //call the appropriate serializer here and assign the result to response

        //return the properly formatted response Message
        outgoingCtx.StatusCode = System.Net.HttpStatusCode.OK;
        return response;
    }

    protected virtual object GetObject( IncomingWebRequestContext requestCTX, NameValueCollection queryParameters )
    {
        throw new ApplicationException( "GetObject Not Implemented" );
    }
}

Note we have made the GetObject method virtual, so for any given endpoint which we wish to implement we can derive from CGetBaseClass and simply override the GetObject method, as in:

public partial class GetMySpecialObject : CGetBaseClass
{
protected override object GetObject( IncomingWebRequestContext requestCTX, NameValueCollection queryParameters)
{ /* Your Code Here */   }

}
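The “process headers here” and “call the appropriate serializer here” steps are left as comments in the base class. As a rough sketch of the selection logic only, written in JavaScript to stay language neutral: the names parseAccept, chooseFormat and SUPPORTED are hypothetical, q-values are honored in a simplified way, and partial wildcards such as text/* are not handled.

```javascript
// Hypothetical helper: split an Accept header into media types, best first.
function parseAccept(header) {
  if (!header) return ["*/*"];                 // no Accept header: anything goes
  return header
    .split(",")
    .map(function (part, index) {
      var pieces = part.trim().split(";");
      var q = 1.0;                             // default weight when no ;q= is given
      pieces.slice(1).forEach(function (p) {
        var kv = p.trim().split("=");
        if (kv[0] === "q" && kv.length === 2) {
          var parsed = parseFloat(kv[1]);
          if (!isNaN(parsed)) q = parsed;
        }
      });
      return { type: pieces[0].trim().toLowerCase(), q: q, index: index };
    })
    .filter(function (e) { return e.type.length > 0 && e.q > 0; })
    .sort(function (a, b) { return b.q - a.q || a.index - b.index; })
    .map(function (e) { return e.type; });
}

// The three formats discussed in this post, in order of preference.
var SUPPORTED = ["application/json", "application/xml", "text/plain"];

// Returns the media type to serialize with, or null if nothing is acceptable.
function chooseFormat(header) {
  var wanted = parseAccept(header);
  for (var i = 0; i < wanted.length; i++) {
    if (wanted[i] === "*/*") return SUPPORTED[0];
    if (SUPPORTED.indexOf(wanted[i]) !== -1) return wanted[i];
  }
  return null;                                 // a caller would answer 406 Not Acceptable
}
```

In the C# base class the returned type would then decide between _JSONSerializer, _XMLSerializer and CreateRawMessage.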

OK, that gets us up to actually implementing an endpoint. And that is the story for the next post!






	

QCON 2011 San Francisco and Occupy California   2 comments

Let me say right off that I do not pay for my own ticket to QCON; my boss picks up the tab.  I love QCON.  It is definitely not MIX. I go there to see what is happening in the world which is NOT Oracle and NOT Microsoft.  That’s the same reason I read their online zine, InfoQ.   QCon always provides a look at what is current and recent in the open stack world.  This year we looked closely at REST, Mobile development, Web API and NOSQL. As they did last year, QCON provides a nice look at what is open and emerging.  Big metal will always be with us, but the desktop is looking very weak for the next few years while Mobile devices of all kinds and makers are exploding.  The biggest fallout is that while HTML5 is only slowly emerging on desktops in place, all new Mobile devices (which is to say most new systems) will be fully HTML5 compliant.  Not only that, but with the exception of Windows Phones, the rendering engine for all mobile devices is based on WebKit.  What this means for those of us in the cubes is that worrying about how to bridge to pre-HTML5 browsers with HTML5 code is a non-issue.  Mobile development is HTML5 development.  The big metal end of the supply chain is being segmented into Web API servers (which service JSON XHR2 data calls) and the NOSQL engines which serve the Web API farms.  Remember, a native mobile app ideally has pre-loaded all of its pages; its interactions are solely over JSON XHR2 for data (be it documents, data or HTML fragments).  The traditional JSP or ASPX web server is not really in play with native mobile apps and has an increasingly small role to play in “native like” or browser based mobile apps.  Let’s move on.

“IPad Light by cloud2013”

Speaking of moving on: There is an occupation going on in this country.  I visited occupation sites in San Francisco, UCal Berkeley and Berkeley “Tent City”.  These are all very active and inspiring occupy sites.  Now if we can only get to Occupy Silicon Valley!

I attended the REST in Practice tutorial this year and it was very nice.  The authors were well informed and the agenda comprehensive.  I personally like the Richardson maturity model but think that people are not facing up to the fact that level three is rarely achieved in practice, and that the rules of web semantics necessary to interoperate at level 3 are almost non-existent. Remember the original REST model is client/server.  The basic model is a finite state machine, and the browser (and the user) are in this model required to be dumb as fish.  Whether JavaScript is a strong enough model, and whether late binding semantics can be made clear enough to pull off level three, is really an open question which no one has an answer to.  If we forget about interoperability (except for OAuth) things start to fall into place, but we thought OPENNESS was important to REST.

Workshop: REST In Practice by the Authors: Ian Robinson & Jim Webber

Why REST? The claims:

· Scalable

· Fault Tolerant

· Recoverable

· Secure

· Loosely coupled

Questions / Comment:

Do we agree with these goals?

Does REST achieve them?

Are there other ways to achieve the same goals?

REST design is important for serving AJAX requests and AJAX requests are becoming central to Mobile device development, as opposed to intra-corporate communication. See Web API section below.

Occupy Market Street (San Francisco)            

The new basic Document for REST: Richardson Maturity Model (with DLR modifications)

Level 0:

One URI endpoint

One HTTP method [Get]

SOAP, RPC

Level 1:

Multiple URI,

One HTTP Method [Get]

Century Level HTTP Codes (200,300,400,500)

Level 2:

Multiple URI,

Multiple HTTP Methods

Fine Grain HTTP Codes (“Any code below 500 is not an error, it’s an event”)

URI Templates

Media Format Negotiation (Accept request-header)

Headers become major players in the interaction between client and server

Level 3:  The Semantic Web

Level 2 plus

Links and Forms Tags (Hypermedia as the engine of state)

Plus emergent semantics

<shop xmlns="http://schemas.restbucks.com/shop"
      xmlns:rb="http://relations.restbucks.com/">
  <items>
    <item>…</item>
    <item>…</item>
  </items>
  <link rel="self" href="http://restbucks.com/quotes/1234" type="application/restbucks+xml"/>
  <link rel="rb:order-form" href="http://restbucks.com/order-forms/1234" type="application/restbucks+xml"/>
</shop>

Think of the browser (user) as a finite State Machine where the workflow is driven by link tags which direct the client as to which states it may transition to and the URI associated with each state transition.
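To make the state machine idea concrete, here is a minimal sketch in JavaScript. It assumes the Restbucks shop representation has already been parsed into a plain object; the object shape and the function names availableTransitions and nextUri are hypothetical, chosen only for illustration.

```javascript
// A parsed representation, modelled on the Restbucks <shop> example.
var quote = {
  items: ["...", "..."],
  links: [
    { rel: "self", href: "http://restbucks.com/quotes/1234", type: "application/restbucks+xml" },
    { rel: "rb:order-form", href: "http://restbucks.com/order-forms/1234", type: "application/restbucks+xml" }
  ]
};

// The states the client may move to are exactly the link relations on offer.
function availableTransitions(representation) {
  return (representation.links || []).map(function (l) { return l.rel; });
}

// The URI for a transition comes from the server's link, never from client-side rules.
function nextUri(representation, rel) {
  var match = (representation.links || []).filter(function (l) { return l.rel === rel; })[0];
  return match ? match.href : null;
}
```

The point of the design is that the client hard codes link relation names only; if the server stops offering a relation, that transition is simply no longer available.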

The classic design paper on applied REST architecture is here: How To GET a Cup Of Coffee. Moving beyond level 1 requires fine grained usage of HTTP status codes, link tags, the change headers and media type negotiation. Media formats beyond POX and JSON are required to use level 3 efficiently (OData and AtomPub, for example).

Dude, where’s my two phase commit? Not supported directly; use the change headers (If-Modified-Since, If-None-Match and ETag) or architectural redesign (redefine resources or workflow). The strategic choices are the design of the finite state machine and the definition of resource granularity.
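As a small illustration of the change-header approach, the sketch below models optimistic concurrency with an in-memory store in JavaScript. The store, its put/get methods and the version counter are hypothetical; it uses If-Match, the write-side companion to the headers listed above, and the standard 412 Precondition Failed status.

```javascript
// Hypothetical in-memory resource store illustrating ETag based updates.
function createStore() {
  var resources = {};                           // uri -> { body, version }

  function etagOf(entry) { return '"' + entry.version + '"'; }

  return {
    // ifMatch plays the role of the If-Match request header
    put: function (uri, body, ifMatch) {
      var current = resources[uri];
      if (!current) {
        resources[uri] = { body: body, version: 1 };
        return { status: 201, etag: etagOf(resources[uri]) };
      }
      if (ifMatch !== etagOf(current)) {
        // the client's copy is stale (or it sent no precondition at all)
        return { status: 412, etag: etagOf(current) };
      }
      current.body = body;
      current.version += 1;
      return { status: 200, etag: etagOf(current) };
    },
    get: function (uri) {
      var current = resources[uri];
      return current
        ? { status: 200, body: current.body, etag: etagOf(current) }
        : { status: 404 };
    }
  };
}
```

Two clients that read the same version cannot both win: the second write carries an out of date ETag and is refused, so the client must re-read and retry instead of relying on a distributed transaction.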

(Image: slide from REST in Practice)

Architectural Choices:

The Bad Old Days: One resource many, many ‘verbs’.

The Happy Future: Many, many resources, few verbs.

The Hand Cuff Era: Few Resources, Few verbs.

The Greater Verbs:

GET: Retrieve a representation of a resource

POST: Create a new resource (Server sets the key)

PUT: Create new resource (Client sets the key); ( or Update an existing resource ?)

DELETE: Delete an existing resource

Comment: The proper use of PUT vs. POST is still subject to controversy and indicates (to me) that level 3 is still not well defined.

Typically they say POST to create a blog entry and PUT to append a comment to a blog. In CouchDB we POST to create a document and PUT to add a revision (not a delta) and get back a new version number. The difference here is how the resource is being defined, which is an architectural choice.

The Lesser Verbs:

OPTIONS: See which verbs a resource understands

HEAD: Return only the header (no response body)

PATCH: Does not exist in HTML5. This would be the delta verb; it is defined for HTTP in RFC 5789, but that specification leaves the format of the delta content open and no common format has been agreed.  Microsoft did some early work on this with their XML Diffgram but no one else followed suit.

Security

Authentication (in order of increased security)

Basic Auth

Basic Auth + SSL

Digest

WSSE Authentication (ATOM uses this)

Message Security:

Message Level Encrypt (WS-SEC)

For the Microsoft coders I highly recommend

RESTful .NET (WCF for REST, Framework 3.5) by Jon Flanders

There are significant advantages to building your RESTful services using .Net.  Here is a comparison table to get you oriented:

DLR’s Cross Reference:

#  | Web Service Standard | REST Service | WCF For REST (Framework 3.5)
1  | TCP/IP + others | TCP/IP | TCP/IP
2  | SOAP Wrapper | HTTP | HTTP
3  | SOAP Headers | HTTP Headers | HTTP Headers
4  | WS*Security | Basic Auth/SSL | Basic Auth/SSL or WS*Security
5  | Early Binding | Late Binding | Late Binding
6  | XSD | WADL | XSD, WADL
7  | XML | Media Negotiation | Media Negotiation
8  | SOAP FAULTS | HTTP Response Codes | HTTP Response Codes
9  | Single Endpoint | Multiple Endpoints, URI Templates | Multiple Endpoints, URI Templates
10 | Client Proxy | Custom | auto-generated Javascript proxy

The REST of the Week

Wednesday is more or less vendor day at QCON and the sessions are a step down from the tutorials, but the session quality picked up again on Thursday and Friday.  XXX XXXX, who gave an excellent tutorial last year, gave an informative talk on ‘good code’.  The Mobile Development and HTML5 tracks were well attended and quite informative.  The field is wide open, with many supporting systems being free to the developer (support will cost you extra), and the choices are broad: from browser ‘responsive design’ applications to native appearing applications to native apps (and someone threw “hybrid app” into the mix).  The Mobile panel of IBM DOJO, JQuery.Mobile and Sencha was hot.  I am new (to say the least) to Mobile development but here are my (somewhat) random notes on these sessions:

MOBILE Development is HTML5 Development

HTML5 is the stack. Phone and Tablet applications use WebKit based rendering engines and HTML5 conformant browsers only (Windows Phone 7 is the exception here). HTML5 has its own new security concerns ( New Security Concerns)

Four major application development approaches are:

· Browser Applications;

· Native like Applications;

· Hybrid Applications; and

· Native Applications.

Browser applications may emulate the screens seen on the parallel desk top browser versions on the front end but in practice the major players (Facebook, YouTube, Gmail) make substantial modifications to at least the non-visual parts of the Mobile experience making extensive use of local storage and the HTML5 manifest standard for performance and to allow for a reasonable off line experience. Browser applications fall under the guidelines of Responsive Design (aka adaptive Design) and tend to be used when content will appear similarly between desktop and Mobile devices.

“Native like” applications use:

· The Browser in full screen Mode with no browser ‘chrome’; and

· Widgets are created using CSS, JS and HTML5 which simulate the ‘look and feel’ of a native application;

· No Access to Native Functionality (GPS, Camera, etc)

· Use of the HTML5 manifest or local storage is not required, but it is strongly encouraged.

A Native application is still an HTML5 application with the following characteristics:

· All JS Libraries, CSS and HTML are packaged and pre-loaded using a vendor specific MSI/Setup package;

· AJAX type calls for data are allowed;

· Access to Native Widgets and/or Widgets are created using CSS, JS and HTML5

· Access to Native Functionality (GPS, Camera, etc)

· Standard HTTP GET or POST are NOT allowed

A Hybrid Application is a “Native Like” Application placed within a wrapper which allows access to device hardware and software (like the camera) via a special JavaScript interface and, with additional special coding, can be packaged within an MSI/Setup and distributed as a pure Native application.

AJAX calls are made via XHR2 (aka XMLHttpRequest Level 2), which among other things relaxes the single domain requirement of XHR and adds support for the Blob and File interfaces.

The following major vendors offer free libraries and IDE for development:

Native Apps: PhoneGap, Appcelerator

Native App Like: Sencha, PhoneGap, IBM Dojo

Browser App: JQuery.Mobile

PhoneGap does NOT require replacement of the Sencha, JQuery.Mobile, Dojo.Mobile or JQuery libraries.

PhoneGap allows JavaScript to call PhoneGap JavaScript libraries which abstract access to device hardware (camera, GPS, etc).

Sencha does not require replacement of the JQuery.Mobile, Dojo.Mobile or JQuery libraries.

Although it is theoretically possible to create “Native like” applications with only JQuery.Mobile this is NOT encouraged.

Local Storage

This is a major area of performance efforts and is still very much open in terms of how best to approach the problem:

The major elements are:

App Cache (for pre-fetch and the Native App approach)

DOM Storage (aka Web Storage)

IndexedDB (vs. Web SQL)

File API (this is really part of XHR2)

Storing Large Amounts of Data Locally

If you are looking to store many Megabytes – or more, beware that there are limits in place, which are handled in different ways depending on the browser and the particular API we’re talking about. In most cases, there is a magic number of 5MB. For Application Cache and the various offline stores, there will be no problem if your domain stores under 5MB. When you go above that, various things can happen: (a) it won’t work; (b) the browser will request the user for more space; (c) the browser will check for special configuration (as with the “unlimited_storage” permission in the Chrome extension manifest).
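Since a write can simply fail once the limit is reached, it is worth guarding every save. The following JavaScript sketch assumes a storage object with the same setItem/getItem shape as the browser's localStorage; trySave and createLimitedStorage are hypothetical names, and the limited storage is only a stand-in so the idea can be tried outside a browser.

```javascript
// Hypothetical wrapper: returns true if the value was stored, false if the
// storage object refused it (for example because the quota was exceeded).
function trySave(storage, key, value) {
  try {
    storage.setItem(key, JSON.stringify(value));
    return true;
  } catch (err) {
    return false;                               // caller can fall back or prune old data
  }
}

// Stand-in for window.localStorage with an artificial character budget.
function createLimitedStorage(maxChars) {
  var data = {};
  var used = 0;
  return {
    setItem: function (key, text) {
      var previous = data.hasOwnProperty(key) ? data[key].length : 0;
      if (used - previous + text.length > maxChars) {
        throw new Error("QuotaExceededError");  // real browsers throw a DOM exception here
      }
      used = used - previous + text.length;
      data[key] = text;
    },
    getItem: function (key) {
      return data.hasOwnProperty(key) ? data[key] : null;
    }
  };
}
```

In a page the call would look like trySave(window.localStorage, "concerts", list), with a false result treated as a signal to trim cached data or fetch from the server instead.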

IndexedDB:


Web SQL Database is a web page API for storing data in databases that can be queried using a variant of SQL.

Storage non-support as of two weeks ago:

          | IE         | Chrome    | Safari     | Firefox    | iOS        | BBX [RIM]  | Android
IndexedDB | Supported  | Supported | No Support | Supported  | No Support | No Support | No Support
Web SQL   | No Support | Supported | Supported  | No Support | Supported  | Supported  | Supported

Doing HTML5 on non-HTML5 Browsers: If you are doing responsive design and need to work with Desktop and Mobile using the same code base, look at JQuery.Mobile, DOJO and Modernizr (there is strong Microsoft support for this JavaScript library).

WEB API

What is it? Just a name for breaking out the AJAX servers from the web server. This is an expansion of REST into just serving data for XHR. It is a helpful way to specialize our design discussions by separating serving pages (with MVC or whatever) from serving data calls from the web page. Except for security the two can be architecturally separated.

Web APIs Technology Stack

(Image: Web API technology stack diagram)

Look familiar? Looks like our old web server stack to me.

NOSQL

The CAP Theorem  (and Here)

  • Consistency: (all nodes have the same data at the same time)
  • Availability: (every request receives a response – no timeouts, offline)
  • Partition tolerance: (the system continues to operate despite arbitrary message loss)

Pick Any Two

If some of the data you are serving can tolerate Eventual Consistency then NOSQL is much faster.

If you need two phase commit, either use a SQL database OR redefine your resource to eliminate the need for the 2Phase Commit.

NoSQL databases come in two basic flavors:

Key/Value: These are popular with content management and where response time must be minimal. In general you define what B-trees (indexes) you want to use before the fact. There are no on the fly joins or projections. MongoDB and CouchDB are typical leaders in this area.

Column Map: This is what Google calls Bigtable. This is better for delivering groups of records based on criteria which may be defined ‘on the fly’. Cassandra is the leader in this group.

Web Sockets:

Sad to say this is still not standardized and preliminary support libraries are still a little rough.  Things do not seem to have moved along much since the Microsoft sessions I attended at MIX 11.

Photos: All Photos by Cloud2013

Microsoft MVC 3 and CouchDB – Low Level Get Calls   1 comment

I have written elsewhere on couchdb on Windows and on using Ruby on Rails to interface to this system.  These posts can be found here:

Part 0 – REST, Ruby On Rails, CouchDB and Me

Part 1 – Ruby, The Command Line Version

Part 2 – Aptana IDE For Ruby

Part 3 CouchDB Up and Running on Windows

Part 4 – CouchDB, Curl and RUBY

Part 5 – Getting The Data Ready for CouchDB

Part 6 – Getting The Data Into And Out Of CouchDB

Part 7 – JQUERY,JPlayer and HTML5

In my work life I work in a Microsoft shop, which for us means Microsoft servers for the back end and (mostly) pure HTML/AJAX front ends.  We are transitioning towards using Microsoft MVC 3 to provide HTTP end points for our AJAX calls.  Here are some notes from my POC work in this area.  My couch data consists of documents describing Grateful Dead concerts stored on the great site Internet Archive; if you have never visited the Internet Archive, please do so.  I reverse engineered the meta data of IA’s extensive collection of Dead concerts (over 2,000 concert recordings).  Visit the Grateful Dead Archive Home at the Internet Archive here.

CouchDB Documents and Views

I stored the meta data in a local couchdb (running on Windows XP).  The basic document I am storing is a master detail set for the ‘best’ recording for each Dead concert.  The master part of the document contains the date, venue and other data of the concert, and the detail set is an array of meta data on each song performed during the concert.  As is traditional with couchdb, the documents are represented as JSON strings.  Here is what the document for the UR recording (1965-11-01) found on the IA looks like:

{
  "_id": "1965-11-01",
  "_rev": "1-6ea272d20d7fc80e51c1ba53a5101ac1",
  "mx": false,
  "pubdate": "2009-03-14",
  "sb": true,
  "venue": "various",
  "tracks": [
    {
      "uri": "http://www.archive.org/download/gd1965-11-01.sbd.bershaw.5417.sbeok.shnf/Acid4_01_vbr.mp3",
      "track": "01",
      "title": "Speed Limit",
      "time": "09:48"
    },
    {
      "uri": "http://www.archive.org/download/gd1965-11-01.sbd.bershaw.5417.sbeok.shnf/Acid4_02_vbr.mp3",
      "track": "02",
      "title": "Neil Cassidy Raps",
      "time": "02:19"
    }
  ]
}

Couchdb allows the creation of views, which are B-trees with user defined keys and user defined sub sets of the document data.  If one wanted to return the venue and the tracks for each concert for a given month and day (across all years), the view created in couchdb would look like:

"MonthDay": {
  "map": "function(doc){emit(doc._id.substr(5,2)+doc._id.substr(8,2),[doc.venue , doc.IAKey, doc.tracks ])}"
}

This view allows us to use an HTTP GET to pass in a month-day key (e.g. “1101”) and get back (as a JSON array):

the key (MMDD: doc._id.substr(5,2)+doc._id.substr(8,2))

the venue (doc.venue);

the IA URI of the concert (doc.IAKey); and

an array of track data (doc.tracks)
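To see what key the map function produces, it can be run by hand against the sample document. This is only a sketch: the emit function below is a local stand-in for the one CouchDB supplies, and the document is trimmed to the fields the view reads.

```javascript
// Collect whatever the map function emits (stand-in for CouchDB's emit).
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// The MonthDay map function from the view definition.
var map = function (doc) {
  emit(doc._id.substr(5, 2) + doc._id.substr(8, 2), [doc.venue, doc.IAKey, doc.tracks]);
};

// A trimmed copy of the 1965-11-01 document.
map({
  _id: "1965-11-01",
  venue: "various",
  tracks: [{ track: "01", title: "Speed Limit", time: "09:48" }]
});

// emitted[0].key is now "1101": characters 5-6 are the month, 8-9 the day.
```

Because the year is dropped from the key, every concert played on November 1st, in any year, sorts under the same key and comes back from a single lookup.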

MVC URL Routing Maps

Although we could call couchdb directly from the browser, we normally work through a gateway system for security, so we will build a shim to sit between the browser and couchdb.  This allows us to flow the authentication / authorization stack separately from couchdb’s security system.  In MS MVC we can create a new HTTP endpoint for AJAX calls (our shim) in a very simple manner. Let’s create an endpoint which will look like:

http://{our server path}/DeadBase/MonthDay/{month}/{day}

where

http://{our server path}/DeadBase/MonthDay/11/01

would request month:11 and day:01 concerts.  In MVC we can declare this routing as:

routes.MapRoute(
    "MyMonthDay",
    "{controller}/{action}/{month}/{day}",
    new { controller = "DeadBase", action = "RestMonthDay" } );

Done.  Interestingly, in MVC 3 this route definition will accept either the form:

http://{our server path}/DeadBase/MonthDay/{month}/{day} ; or

http://{our server path}/DeadBase/MonthDay?month="??"&day="??"

In the second form, parameter order does not matter, but case does; quotation marks are optional and need to be dealt with internally by the action method.

Either of these calls will resolve to the same controller and method.

MVC Controller and Method Handler

We now need to create the shim which will be the target for the Http Endpoint.  In C# this looks like:

public class DeadBaseController : Controller
{
    public string RestMonthDay( string month, string day )
    {
        //our shim code goes here
    }
}

We are able to use string as our return type because we will be calling couchdb, which returns a string form of JSON by default.  As a side note, if we wanted to use MVC 3 to return JSON from a native C# object our controller method takes a different form:

public JsonResult GetStateList()
{
    List<ListItem> list = new List<ListItem>() {
        new ListItem() { Value = "1", Text = "VA" },
        new ListItem() { Value = "2", Text = "MD" },
        new ListItem() { Value = "3", Text = "DC" } };

    // for HTTP GET requests MVC requires Json(list, JsonRequestBehavior.AllowGet)
    return this.Json(list);
}

Our AJAX call from the browser does not need to know any of these details.  Here is one way to code the call in JavaScript using JQuery:

var url = urlBase + "?" + args;

$.ajax({
    url: url,
    dataType: 'json',
    success: okCallBack,
    error: nookCallBack
});

function okCallBack(data) {
    gdData = data;
    //do something useful here
}

function nookCallBack(xhr, ajaxOptions, errorThrown) {
    alert("ErrorText:" + errorThrown + " " + "Error Code:" + xhr.status);
}

From Handler to CouchDB in C#

Here is the rest of the generic C# code to go from the Handler to CouchDB and back.

Clean the parameters and pass the call to a generic couchDB GET caller:

image

Format the view name and parameter into couchdb format  and pass to the low level couchDB caller:

image

Classic Framework HTTP code to make the HTTP GET and return the results as a string back up the call stack:

image
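Since the listings for these three steps are images, here is a rough JavaScript sketch of the first two (cleaning the parameters and formatting the view request). It is not the C# code from the post: cleanPart, buildViewUrl and monthDayUrl are hypothetical names, and the database name deadbase, the design document name concerts and the local port 5984 are assumptions. The URL layout /{db}/_design/{ddoc}/_view/{view}?key=... with a JSON encoded key is CouchDB's standard view query form.

```javascript
// Strip optional quotes and other noise, keep digits, left-pad to two characters.
function cleanPart(value) {
  var digits = String(value).replace(/[^0-9]/g, "");
  return digits.length === 1 ? "0" + digits : digits;
}

// Build a CouchDB view query URL; the key must be JSON encoded, then URL encoded.
function buildViewUrl(couchBase, db, designDoc, view, key) {
  return couchBase + "/" + encodeURIComponent(db) +
    "/_design/" + encodeURIComponent(designDoc) +
    "/_view/" + encodeURIComponent(view) +
    "?key=" + encodeURIComponent(JSON.stringify(key));
}

// Assumed names: database "deadbase", design document "concerts".
function monthDayUrl(month, day) {
  return buildViewUrl("http://localhost:5984", "deadbase", "concerts", "MonthDay",
                      cleanPart(month) + cleanPart(day));
}
```

The third step is then an ordinary HTTP GET against the resulting URL, with the response body passed back up the stack unchanged.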

We could (and did) take our Browser code from the Ruby on Rails project above and with minimum changes call our MVC shim.

Simple clean and fun.

Occupy your mind2

QCon San Francisco 2011   1 comment

QCON, the software conference hosted by InfoQ, will be meeting in San Francisco next month from November 14 through November 16.  I attended last year’s conference and am looking forward to attending again this year.

I am also a regular attendee of the Microsoft MIX Conferences.  Even though both of these conferences focus on Web development, the contrast between them could not be greater.  First off, MIX is much larger and is of course devoted to all things Microsoft.  MVC 3 was the big push this year at MIX.  This is a very strong development approach, with Microsoft doing the ‘embrace and extend’ dance which it does so well.  In this case the source is Ruby on Rails and its approach to standard MVC development.  MVC 3 (and Microsoft) approaches the Web from the perspective of the corporate developer of (basically) client server architecture.  But it is not a bad or evil effort.  Indeed the improved and streamlined HTTP pipeline used by IIS for MVC is fast, the tools and development environment are well thought out and, once you drop down a level, the low level support for REST(ful) approaches, JSON and HTML templates is impressive.  In addition to JSON and JQUERY, Microsoft is also a strong supporter of the emergent ODATA standard.  I recommend MIX (and the Channel 9 videos of the conference) to anyone working with or considering Microsoft development tools. I always learn new things and gain important information on how to advance the web at MIX.  You can read more details on the sessions here.

In terms of big metal companies, MVC 3 and Framework 4.0 are much stronger than anything Java EE has to offer. The biggest problem Microsoft has is that it cannot seem to ship its HTML5 compatible browser, and so its development systems do not optimize for (or even in some cases take advantage of) the strongest and newest features of HTML5.  In addition, try as they will, two things Microsoft will never be are cutting edge or free.  Over in the LAMP and Rails and NOSQL world, QCon offers a look at how the world of the web will be (or at least could be) if any of the independent developers who make up most of QCON’s speakers and attendees are able to hit the mark with the next big thing.  It’s always a mixed bag of nuts at QCon, a nice mixture of visionaries and hucksters, Rastafarians and Agile advocates.  I like this conference because in addition to providing me with some alternative voices to Google, Microsoft and Oracle, it also forces me both to re-evaluate the way I am doing things and to think independently about HOW we can do web development.  And San Francisco is a much better venue than La$ Wage$.  This is a hacker fest with the emphasis on cool technique, not on how to create the next big thing (product or Brand).  This is NOT Web 2.0 Summit, which is about venture capitalism defining the web.  Alexia Tsotsis will not be covering this.

By the way, if you are not reading InfoQ on the web regularly you ARE missing out.

REST, Ruby On Rails, CouchDB and Me – Part 5 Getting The Data Ready for CouchDB   2 comments

Part 0 – REST, Ruby On Rails, CouchDB and Me

Part 1 – Ruby, The Command Line Version

Part 2 – Aptana IDE For Ruby

Part 3 CouchDB Up and Running on Windows

Part 4 – CouchDB, Curl and RUBY

Part 5 – Getting The Data Ready for CouchDB

Part 6 – Getting The Data Into And Out Of CouchDB

Part 7 – JQUERY,JPlayer and HTML5

The Internet Archive and the Grateful Dead

The Internet Archive (IA) is a 501(c)(3) non-profit corporation dedicated to the dissemination of public domain digital artifacts.  This includes such items as books, videos of all kinds (TV shows, shorts and feature films) and audio recordings of all kinds (musical and spoken word).    One of their most popular projects is a truly huge collection of Grateful Dead concert recordings.  One of the most visited pages of the Internet Archive’s web site is the “Grateful Dead Shows on This Day In History” page, which lists and allows playback of any Grateful Dead concerts they have in their collection for whatever day you visit the site.  Did I say that IA has a large number of Grateful Dead concert recordings?   There are recordings of around 2,000 separate concert dates.  Any given concert may be represented by multiple recordings from a variety of sources: soundboard and audience recordings using professional and amateur equipment.  Original media ranges from cassette tapes to 7 inch reel to reel and digital media.  If you are going to be working with the IA Grateful Dead collection please review the FAQ on the collection’s policy notes as well as the special notes here.

IA uses a very sophisticated data repository of meta data and an advanced query engine to allow retrieving both the meta data and the recordings.  Meta data can be retrieved directly using the “advanced search” engine.  On the day I started this post I visited IA and used the “Grateful Dead Shows on This Day In History” page.  The query returned data on 8 concerts (and 25 recordings of those 8 concerts).  A partial page image is given below:

image

Clicking on any of these entries moves us to a second screen in order to play the concert recording.  A screen shot of the playback screen looks like this:

image

Looking closer at the second screen we see the music player:

image

Can we design a faster, simpler and better looking interface into the Grateful Dead Archive?  Can couchDB help us?  The first question will be addressed in a later post.  This current post will look at how couchDB can help us achieve a faster, more efficient information system.  IA does a super job of serving up the music files on demand, so there is no reason to duplicate their storage system.  However, IA is fairly slow to serve up meta data (such as the results of the “Grateful Dead Shows on This Day In History” query).  Abstracting the IA metadata into a CouchDB database will allow us to serve up the meta data much faster than the IA query system.

Getting Data Into CouchDB

Our basic plan for using RUBY to get data from IA and into couchdb consists of:

  1. Prepare a URL query request to get the basic recording meta data (not the track meta data);
  2. Submit A GET request to IA using the URL query;
  3. Parse the XML returned to get at the individual Concert meta data fields;
  4. Select the BEST recording for any given concert (more on this below);
  5. Prepare a URL to request track listing XML file based on the IA Primary Key of the selected concert recording;
  6. Submit a GET request to IA;
  7. Parse the XML returned to get at the individual track meta data fields;
  8. Create a ruby object which can be safely serialized to JSON;
  9. Serialize the object to a JSON string (i.e. a JSON document);
  10. Do a POST request to insert a JSON document for each concert into the couchdb database;
  11. Create couchDB views to allow optimized data retrieval; and
  12. Create a couchDB view to optimize retrieval of recordings for all years for an arbitrary Month and Day (this duplicates the data provided by the “Grateful Dead Shows on This Day In History” selection in the Internet Archive).

Note we are not accessing nor storing the actual music files.  Before discussing how this plays out in practice let’s define our JSON couchDB document.  We will cover items one through eight in this post.  We turn to items nine through twelve in the next post.

CouchDB Document Schema

CouchDB databases start with documents as the basic unit.  Typically a couchdb based application will have one database holding one or more variant document types.  There will be one or more design documents which provide multiple views, show functions and map functions as necessary to facilitate the application.  We will use a single document type which will represent an abstract of the meta data contained in IA for individual recordings (we are going to select the one ‘best’ recording per concert).  Our couchdb database will hold one document per concert.  The tracks (actually the track meta data) will be stored as an array within the concert document.  We will populate the couchdb database in a single background session pulling meta data (NOT THE MUSIC FILES) from IA, and we will include the IA publication date in the document so we can update our database when (if) new recordings are added to the Grateful Dead collection in IA.

Here are the document fields  which we will use:

Field | Notes | Typical Values
_id | couchdb primary key.  We will use a natural key: a string representation of the concert date. | 1969-07-04
_rev | revision number provided by couchDB | 1-6ea272d20d7fc80e51c1ba53a5101ac1
IAKey | Internet Archive Key for this recording | gd1965-11-01.sbd.bershaw.5417.sbeok.shnf
pubdate | Internet Archive date when the recording was published to the web | 2009-03-14
venue | Where the concert took place | Fillmore East Ballroom
description | free text describing the concert, provided by the uploader | Neal Cassady & The Warlocks 1965 1. Speed Limit studio recording/Prankster production tape circa late 1965
cm | boolean: recording by Charlie Miller, used to select the ‘best’ recording | true or false
sb | boolean: recording was made from a soundboard, used to select the ‘best’ recording | true or false
mx | boolean: a matrix style recording, used to select the ‘best’ recording | true or false
tracks | an array of meta data for each track of the recording | see below

Each track in the tracks array  formally looks like:

Field | Notes | Typical Value
IAKey | The Internet Archive key for this track.  This key is unique within a given recording (see the IAKey above) | gd1965-11-01.sbd.bershaw.5417.sbeok.shnf/Acid4_01_vbr
track | track number | 02
title | the song title | Cold Rain and Snow
time | the length of the track in minutes and seconds | 09:48

Let’s call everything except the tracks our BASE data and the track data our TRACK data.

We insert documents to the database (using an HTTP post) as JSON so a typical document would look like this in JSON format:

{
  "_id": "1966-07-30",
  "IAKey": "gd1966-07-30.sbd.GEMS.94631.flac16",
  "pubdate": "2008-09-22",
  "venue": "P.N.E. Garden Auditorium",
  "description": "Set 1 Standing On The Corner I Know You Rider Next Time You See Me",
  "cm": false,
  "sb": true,
  "mx": false,
  "tracks": [
    {
      "IAKey": "gd1966-07-30.sbd.GEMS.94631.flac16/gd1966-07-30.d1t01_vbr",
      "track": "01",
      "title": "Standing On The Corner",
      "time": "03:46"
    },
    {
      "IAKey": "gd1966-07-30.sbd.GEMS.94631.flac16/gd1966-07-30.d1t02_vbr",
      "track": "02",
      "title": "I Know You Rider",
      "time": "03:18"
    },
    {
      "IAKey": "gd1966-07-30.sbd.GEMS.94631.flac16/gd1966-07-30.d1t03_vbr",
      "track": "03",
      "title": "Next Time You See Me",
      "time": "04:00"
    }
  ]
}
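For readers who want to see this shape from the Ruby side, here is a rough sketch of how such a document could be assembled as a Hash before it is serialized.  The helper names build_track and build_concert_doc are made up for this illustration (they are not part of the program in this series), and I am assuming the json library is available.

```ruby
require 'json'

# Hypothetical helper: one entry of the "tracks" array (the TRACK data).
def build_track(recording_key, file_stem, number, title, time)
  {
    'IAKey' => "#{recording_key}/#{file_stem}",
    'track' => format('%02d', number),   # zero padded, e.g. "02"
    'title' => title,
    'time'  => time
  }
end

# Hypothetical helper: the BASE data plus the TRACK data for one concert.
def build_concert_doc(base, tracks)
  base.merge('tracks' => tracks)
end

key = 'gd1966-07-30.sbd.GEMS.94631.flac16'
base = {
  '_id' => '1966-07-30', 'IAKey' => key, 'pubdate' => '2008-09-22',
  'venue' => 'P.N.E. Garden Auditorium',
  'description' => 'Set 1 Standing On The Corner I Know You Rider Next Time You See Me',
  'cm' => false, 'sb' => true, 'mx' => false
}
tracks = [
  build_track(key, 'gd1966-07-30.d1t01_vbr', 1, 'Standing On The Corner', '03:46'),
  build_track(key, 'gd1966-07-30.d1t02_vbr', 2, 'I Know You Rider', '03:18')
]

json_string = JSON.generate(build_concert_doc(base, tracks))
puts json_string
```

Because the Hash holds only strings, booleans and an array of Hashes, the json library can serialize it without any extra work.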


Hacking The Internet Archive: Getting Data From IA and Into CouchDB:

Here is the URL to obtain the page for “Grateful Dead Shows on This Day In History”:

http://www.archive.org/search.php?query=collection:GratefulDead%20date:19??-08-04&sort=-/metadata/date

This is a simple GET request with a query for the IA “collection” of Grateful Dead items filtered on the date string 19??-08-04 and sorted descending by the concert date.  This GET returns an HTML page.  This type of interface is known as an HTTP RPC interface.  RPC (Remote Procedure Call) interfaces are not pure REST interfaces but they are somewhat RESTful in that they allow us to make a data request using a late bound, loosely coupled HTTP call.  See here and here for more theoretic background on RPC calls.  IA provides an “Advanced Search” function which will allow us to return data for an arbitrarily complex query in one of several different data formats other than HTML.  We selected XML as the format for our work here.  XML is the traditional format for HTTP RPC but other formats may be better for certain applications.  Unfortunately IA does not directly document the format of the RPC data request but they do provide a QBE (query by example) page to build the request.  The page looks like this:

image

Using this screen we can compose a HTTP RPC request which will mimic the URL produced by “Grateful Dead Shows on This Day In History” and with a little brain effort and experimentation we can  understand how to compose requests without using the QBE screen.  By feeding the RPC request query back into advanced search and selecting XML as an output format as shown here:

clip_image001

we produce an example of the HTTP RPC request which will return our desired data in our desired format.  Thus we generate a URL-encoded RPC request like:

@uri="http://www.archive.org/advancedsearch.php?q=collection%3AGratefulDead+date%3A#{_dateString}&fl%5B%5D=avg_rating&fl%5B%5D=date&fl%5B%5D=description&fl%5B%5D=downloads&fl%5B%5D=format&fl%5B%5D=identifier&fl%5B%5D=publicdate&fl%5B%5D=subject&fl%5B%5D=title&sort%5B%5D=date+desc&sort%5B%5D=&sort%5B%5D=&rows=2000&page=1&callback=callback&output=xml"

where we replace #{_dateString} with a date string like 19??-08-08, which returns Grateful Dead recording data for all years of the last century which were recorded on 08-08.  Of course to get one year’s worth of data we could use a date string like 1968-??-??.  It is a simple extension of the query language to replace the singular date request, date%3A#{_dateString}, with a date range.

The XML output returned to the caller looks like:
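If you would rather not hand-encode the request, the same kind of URL can be put together from plain parameters.  This is only a sketch: ia_search_url and the two constants are names I made up, it drops the empty sort[] slots and the callback parameter from the generated request, and URI.encode_www_form also escapes the ? wildcards as %3F, which I am assuming the server decodes to the same query.

```ruby
require 'uri'

IA_SEARCH = 'http://www.archive.org/advancedsearch.php'
IA_FIELDS = %w[avg_rating date description downloads format identifier publicdate subject title]

# Hypothetical helper: build the advanced search URL for a date pattern
# such as "19??-08-08" (one day, all years) or "1968-??-??" (one year).
def ia_search_url(date_string, rows = 2000)
  params = [['q', "collection:GratefulDead date:#{date_string}"]]
  IA_FIELDS.each { |field| params << ['fl[]', field] }
  params << ['sort[]', 'date desc'] << ['rows', rows] << ['page', 1] << ['output', 'xml']
  "#{IA_SEARCH}?#{URI.encode_www_form(params)}"
end

puts ia_search_url('19??-08-08')
puts ia_search_url('1968-??-??')
```

Keeping the field list in one array makes it easy to add or drop a field without touching the encoding.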

clip_image001[4]

In a more graphic format the output looks like:

clip_image001[6]

Within Ruby we will need to make the HTTP GET request with a desired date range, transform the body of the response into an XML document and use XPATH to parse the XML and retrieve the meta data values for each recording (see below).  There is NOTHING inherently wrong with this RPC interface.  It is flexible and allows us to select only the data fields we are interested in and return data only for the dates we wish.  RUBY (at least the version used here) supports neither JSON nor XML natively, so the XML format of the data is as good as any other, and numerous tools exist in RUBY to manipulate XML data.  I wish RUBY had a more native interface for JSON but it does not.

At this point, we do not have meta-data about individual tracks in a given recording.  It turns out, dear reader, that we can get this data, but not through an HTTP RPC request.  If we have the IAKey for the recording we can obtain an XML file with track meta data by making the following call:

http://www.archive.org/download/{IAKEY}/{IAKEY}_files.xml

This file contains assorted XML data; it varies by what formats IA makes available for the individual tracks.  The file is served via an HTTP redirect (a 3xx status).  This is not an RPC call so we are far from a RESTful interface here.  We do not have control over the fields or range of the data included in this call.  It is all or nothing.  But at least the XML format is simple to manipulate.  With the IAKey in hand for an individual recording, and making some reasonable guesses, we can parse the XML file of track data and compose the TRACKS array for our couchDB document using XPATH.  A single entry for the high bit rate mp3 track recording looks like:

<file name="gd89-08-04d2t01_vbr.mp3" source="derivative">
<creator>Grateful Dead</creator>
<title>Tuning</title>
<album>1989-08-04 – Cal Expo Amphitheatre</album>
<track>13</track>
<bitrate>194</bitrate>
<length>00:32</length>
<format>VBR MP3</format>
<original>gd89-08-04d2t01.flac</original>
<md5>91723df9ad8926180b855a0557fd9871</md5>
<mtime>1210562971</mtime>
<size>794943</size>
<crc32>2fb41312</crc32>
<sha1>80a1a78f909fedf2221fb281a11f89a250717e1d</sha1>
</file>

Note that we have the IAKey for the track (gd89-08-04d2t01 ) as part of the name attribute.
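To make the “reasonable guesses” concrete, here is a small sketch that turns entries like the one above into TRACK data.  The sample XML is cut down and partly invented (the flac entry and the RECORDING_KEY placeholder are mine), the helper names are hypothetical, and I am assuming the VBR MP3 entries are the ones we want.

```ruby
require 'rexml/document'

# Cut down, partly invented sample of a _files.xml document.
SAMPLE_FILES_XML = <<~XML
  <files>
    <file name="gd89-08-04d2t01.flac" source="original">
      <title>Tuning</title>
      <format>Flac</format>
    </file>
    <file name="gd89-08-04d2t01_vbr.mp3" source="derivative">
      <title>Tuning</title>
      <track>13</track>
      <length>00:32</length>
      <format>VBR MP3</format>
    </file>
  </files>
XML

# Hypothetical helper: text of a child element, or nil when the child is missing.
def child_text(elm, name)
  child = elm.elements[name]
  child ? child.text : nil
end

# Build the TRACK data for every VBR MP3 entry in a _files.xml document.
def tracks_from_files_xml(xml_string, recording_key)
  tracks = []
  REXML::Document.new(xml_string).elements.each('files/file') do |file_elm|
    next unless child_text(file_elm, 'format') == 'VBR MP3'
    stem = File.basename(file_elm.attributes['name'], '.*')  # drop the .mp3 extension
    tracks << {
      'IAKey' => "#{recording_key}/#{stem}",
      'track' => child_text(file_elm, 'track'),
      'title' => child_text(file_elm, 'title'),
      'time'  => child_text(file_elm, 'length')
    }
  end
  tracks
end

tracks = tracks_from_files_xml(SAMPLE_FILES_XML, 'RECORDING_KEY')
puts tracks.inspect
```

Filtering on the format element is what keeps the original flac files and other derivatives out of the tracks array.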


Using a background Ruby Process to Read the Data

The following RUBY GEMS are required to complete this step:

rest-open-uri : This GEM extends open-uri to support the POST, PUT and DELETE HTTP commands

json : This GEM handles serialization and de-serialization of a limited subset of RUBY into JSON strings.

From the standard RUBY library we will also be using

rexml : This library creates XML documents from XML strings and supports XPATH which we will use to read the XML documents from IA

Our first step is to get the data via HTTP and parse the XML file returned to find individual recordings.  There will (in most cases) be multiple recordings per concert (per date) and we want to retain for the database only the “best”.

In pseudo Ruby code:

require 'rest-open-uri'
require 'rexml/document'

def initialize(_dateString)
  # HTTP GET, create a string of the response body and transform the string into an XML node tree
  # mind the screen wrap and URL encoding:
  @uri = "http://www.archive.org/advancedsearch.php?q=collection%3AGratefulDead+date%3A#{_dateString}&fl%5B%5D=avg_rating&fl%5B%5D=date&fl%5B%5D=description&fl%5B%5D=downloads&fl%5B%5D=format&fl%5B%5D=identifier&fl%5B%5D=publicdate&fl%5B%5D=subject&fl%5B%5D=title&sort%5B%5D=date+desc&sort%5B%5D=&sort%5B%5D=&rows=2000&page=1&callback=callback&output=xml"

  xmlString = ''
  open(@uri) do |x|       # build a representation of the response body as a string
    x.each_line do |y|
      xmlString = xmlString + y
    end
  end # open
  if xmlString == ''
    puts 'No String Returned From Internet Archive'
    exit
  end
  @IAXMLDocument = REXML::Document.new(xmlString)  # turn the string into an XML document
end

Now we need  to loop through the XML document and pull out each ‘doc’ section using XPATH and read each doc section for the meta data for that recording.

# use XPATH and find each response/result/doc node and yield
def get_recordings(document)
  document.elements.each('response/result/doc') do |doc|
    yield doc
  end
end

# get the XML document and yield
def get_record(xmldoc)
  get_recordings(xmldoc) do |doc|
    yield doc
  end
end

# general purpose XPATH method to extract element.text (the metadata values) for arbitrary XPATH expressions
def extract_ElmText(doc, xpath)
  doc.elements.each(xpath) { |element| return element.text.to_s }  # first match wins
  ''  # no matching element
end

def worker(xmldoc)
  # main loop
  _docCount = 0
  get_recordings(xmldoc) do |doc|
    _docCount += 1
    _date = extract_ElmText(doc, 'date[@name="date"]')[0..9]
    _description = extract_ElmText(doc, "str[@name='description']")
    _title = extract_ElmText(doc, "str[@name='title']")
    _title = clean_Title(_title)
    _keylist = pull_keys(doc)
    _pubdate = extract_ElmText(doc, 'date[@name="publicdate"]')[0..9]  # there is a bug here, corrected by the lines below

    if (_pubdate.length == 0)
      _pubdate = '1999-01-01'
      puts "No Publication Date: #{_date} #{_title}"
    end
    _uri = extract_ElmText(doc, 'str[@name="identifier"]')
    _tracklist = []  # assumption: the track meta data is filled in by a later step

    # make a RUBY class object to hold one recording
    _record = GDRecording.new _date, _description, _tracklist, _title, _keylist, _pubdate, _uri

    # save the recording class objects in an array
    @list << _record
  end
end

In this code the ‘worker’ method calls the helper methods to:

0) Do the HTTP  get to make the RPC request and read the response body one line at a time and

1) transform the lines into a single string and convert (REXML::Document.new) the string into an XML document for processing by XPATH

2) loop through the doc nodes of the xml tree and extract the values of the  meta data fields

3) pass the meta data values to a RUBY class (GDRecording) which holds this meta data for later processing, and

4) finally, temporarily store the recordings in an array for the next processing step.

Note that these routines work whether the query returns a single day (with multiple recordings) or multiple days or even the whole dataset!  What is essential is that we process the file as N ‘doc’ sub trees (which represent single recordings) and have the recording date (at least) to group our data and extract the ‘best’ recording within each date group.
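Here is a self-contained sketch of that idea: walk the ‘doc’ sub trees and group what we find by date.  The sample response is invented (only the element names follow the XPATH expressions used above, and the timestamp format is an assumption), and first_text and recordings_by_date are hypothetical names.

```ruby
require 'rexml/document'

# Invented sample; real responses carry many more fields per doc.
SAMPLE_RESPONSE = <<~XML
  <response>
    <result name="response" numFound="3" start="0">
      <doc>
        <date name="date">1989-08-04T00:00:00Z</date>
        <str name="identifier">example-recording-a</str>
      </doc>
      <doc>
        <date name="date">1989-08-04T00:00:00Z</date>
        <str name="identifier">example-recording-b</str>
      </doc>
      <doc>
        <date name="date">1968-08-04T00:00:00Z</date>
        <str name="identifier">example-recording-c</str>
      </doc>
    </result>
  </response>
XML

# Text of the first element matching the XPath, or '' when nothing matches.
def first_text(node, xpath)
  elm = node.elements[xpath]
  elm ? elm.text.to_s : ''
end

# One small Hash per 'doc' sub tree, grouped by concert date.
def recordings_by_date(xml_string)
  records = []
  REXML::Document.new(xml_string).elements.each('response/result/doc') do |doc|
    records << {
      date: first_text(doc, 'date[@name="date"]')[0..9],  # keep YYYY-MM-DD only
      key:  first_text(doc, 'str[@name="identifier"]')
    }
  end
  records.group_by { |r| r[:date] }
end

groups = recordings_by_date(SAMPLE_RESPONSE)
groups.each { |date, recs| puts "#{date}: #{recs.map { |r| r[:key] }.join(', ')}" }
```

The same call works for one day, one year or the whole collection, since nothing in it depends on how many doc nodes come back.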

Our next step will be to group the recordings by day (i.e. concert) and provide our own filter to select a single ‘best’ recording for each concert.

Shake and Bake:  Finding A ‘Best’ Recording.


What is the best Grateful Dead concert?  Why, the first one I went to, of course.  Just ask any Deadhead and you will probably get the same answer.  But what is the best recording of any given GD concert?  My approach is very simple.

  • More recently posted recordings are better than older recordings. (least important criterion)
  • Soundboard recordings are better than audience recordings.
  • Matrix recordings are even better.
  • Recordings mixed by Charlie Miller are best of all. (most important criterion)

Well these are MY criteria.  Whatever your criteria, as long as they are hierarchical you can code the selection in a very trivial manner.  If we have a field in each recording for the concert date and a field for each selection criterion (we derive these from the keywords field in IA) we sort the recordings by date and then by each of the criteria from most important (Charlie Miller in my case) to least important (date posted) and then select the first recording in sort order within each date group.  In Ruby the sort of our list of recordings is trivial to code and easy to manipulate (add new criteria or change the priority of criteria).  The sort statement looks like this:

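# date ascending; within a date the criteria are compared in reverse (b before a) so that true sorts ahead of false and the newest pubdate comes first
@list.sort! { |a,b| [a.date, b.cm.to_s, b.mx.to_s, b.sb.to_s, b.pubdate] <=> [b.date, a.cm.to_s, a.mx.to_s, a.sb.to_s, a.pubdate] }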

Once sorted we create a list of best recordings as:

def newSelect
  _dateGroup = nil
  _list = Array.new
  @selectList ||= Array.new
  if @list == nil or @list.count == 0
    puts 'No Recordings.'
    return
  end
  @list.each do |rec|
    if _dateGroup != rec.date
      if _dateGroup != nil
        @selectList << _list[0]  # first recording in sort order for the previous date
      end
      _dateGroup = rec.date
      _list = Array.new
    end
    _list << rec
  end
  if _dateGroup != nil
    @selectList << _list[0]  # do not forget the last date group
  end
end

Note that this code is not only simple but it is independent of the selection criteria we are using.
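For anyone who wants to try the selection rules in isolation, here is a compact sketch under my criteria (Charlie Miller, then matrix, then soundboard, then most recently posted).  The Recording struct, rank and best_per_date are names invented for this example, and the sample data is made up; it uses group_by rather than the loop above.

```ruby
# Hypothetical stand-in for the recording class used in this series.
Recording = Struct.new(:date, :cm, :mx, :sb, :pubdate, :key)

# Each flag becomes 0 (preferred) or 1, so an ascending sort puts the
# preferred recording first.  Most important criterion comes first.
def rank(rec)
  [rec.cm ? 0 : 1, rec.mx ? 0 : 1, rec.sb ? 0 : 1]
end

# Returns a Hash of concert date => best recording for that date.
def best_per_date(recordings)
  recordings.group_by(&:date).map do |date, group|
    # Ties on rank go to the newest pubdate (note b before a for pubdate).
    best = group.sort { |a, b| [rank(a), b.pubdate] <=> [rank(b), a.pubdate] }.first
    [date, best]
  end.to_h
end

sample = [
  Recording.new('1969-07-04', false, false, true,  '2005-01-01', 'soundboard'),
  Recording.new('1969-07-04', true,  false, true,  '2004-01-01', 'miller'),
  Recording.new('1969-07-04', false, true,  false, '2010-01-01', 'matrix'),
  Recording.new('1972-07-22', false, false, false, '2009-01-01', 'audience-old'),
  Recording.new('1972-07-22', false, false, false, '2011-01-01', 'audience-new')
]

best = best_per_date(sample)
best.each { |date, rec| puts "#{date}: #{rec.key}" }
```

Changing the priority of the criteria only means reordering the entries in rank.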

Now that we have a list of recordings we are interested in,  we can get the XML file of track meta data using the IAKey discussed above and making a simple GET call and parsing the XML file for the meta data for each.  Much of the code used duplicates the XML code presented above so we need not reproduce all the code except to show a short section which uses a slightly different XML XPATH syntax:

xmlString = ''
open(filesURI) do |x| x.each_line do |y| xmlString = xmlString + y end end
REXML::Document.new(xmlString).elements.each('files') do |doc|
  doc.elements.each('file') do |file_elm|
    title = nil
    trackString = nil
    lengthString = nil
    obj = file_elm.attributes["name"]  # the file name, which carries the track IAKey
    file_elm.elements.each('title') { |element| title = element.text }
    file_elm.elements.each('track') { |element| trackString = element.text }
    file_elm.elements.each('length') { |element| lengthString = element.text }

    # {omitted code}
  end
end

Okay now we have a (hash) list of recording meta data,  each item of which contains a (hash) list of track meta data for that recording.  In our next post we will leave this unRestful world behind and move into the RESTful world of couchDB when we:

  • Serialize the object to a JSON string (i.e. a JSON document);
  • Do POST requests to insert  a JSON document for each concert into the couchdb database;
  • Create couchDB views to allow optimized data retrieval; and
  • Create a couchDB view to optimize retrieval of recordings for all years for an arbitrary Month and Day (this duplicates the data provided by the “Grateful Dead Shows on This Day In History” selection in the Internet Archive).


REST, Ruby On Rails, CouchDB and Me – Part 4 – CURL on Windows And Ruby POST

In The Post:

  • CURL and Couchdb
  • Documents Design and Otherwise
  • Posting Documents to couchDB Using Ruby

If you are like me you have spent some time with the free ebook: CouchDB The Definitive Guide.  If you are a windows user you may have run into some problems with the examples given in the chapter  on “Design Documents”.  Specifically they don’t work ‘out of the box’.  The examples in that chapter show us how to: create a database, to create and post a design document and to post a document to the database.  These examples use  CURL in a command shell.


Since we are running Windows, first we need to install CURL on our system.  We can get a Windows version here.  Use the version labeled DOS, Win32-MSVC or Win64 depending on your system.  Either set your system path to include the CURL executable or call it by its full path.  We assume here that couchDB has been installed successfully on your system.  Now open a ‘command prompt’ on your system.  If you must have a UNIX type shell you need to install Cygwin or some other UNIX emulator for Windows.  If you are using the Aptana IDE like me you need to create an “external command” to open a command shell within Aptana.  This figure illustrates the setup within the Aptana IDE to do this:

image

In the command shell you can create a couchdb database using a PUT command and CURL.  Couchdb is RESTful so we use a PUT command for actions which CREATE a named resource, of which a database is one example.  The format of the command is:

curl -X PUT http://{couchdb}/{yourdatabasename}

I want to create a database named deadbase so on my system this command and response looks like:

C:\Documents and Settings\dredfield\My Documents\Aptana Studio Workspace\couchDB01

>curl -X PUT http://127.0.0.1:5984/deadbase

{"ok":true}

Here {"ok":true} is the response body of the HTTP response to my PUT command.  Confirm your work by starting a browser and navigating to the Futon user interface of your couchdb installation.  On my system this url is:

http://127.0.0.1:5984/_utils/index.html

you should see something like this:

image

CURL and Documents

OK, now let’s make a design document for this database and PUT that document to the new database.  With slight modifications to the example given in CouchDB The Definitive Guide my first cut at a design document looks like this:

{
  "_id": "_design/example",
  "views": {
    "View00": {
      "map": "function(doc){emit(doc._id, doc.date)}"
    }
  }
}

This is a JSON formatted document.  Initial syntax checking is up to you.  CouchDB will reject a body that is not parseable JSON, but a document can parse cleanly and still not say what you meant, so it pays to check the syntax before you post.  We have several options for checking syntax.  There are free online syntax checkers like JSONLint.  The interface to JSONLint looks like:

clip_image001

An installable open source JSON checker and visualizing tool, JSON View is available here.  JSON View’s output looks like:

clip_image001[10]

Now that we know our syntax is correct (if not the logic of the design document; more on this in the next installment) we can PUT this document to our database.  We can have more than one design document in a given database.  The name (id) of this document is “_design/example”, where “_design” tells couchdb this is indeed a design document and its name is “example”.  My document is named mydesign.json on my file system.  The CURL command to PUT this into the database looks like:

curl -X PUT http://127.0.0.1:5984/deadbase/_design/example -d @mydesign.json

couchdb will respond:

{"ok":true,"id":"_design/example","rev":"1-45f081a3f681b28ce7a0bdc5db216e74"}

Note here that this is NOT the syntax shown in CouchDB The Definitive Guide.  The syntax there will not work in a Windows shell (i.e. command prompt).  Even when you have a syntactically correct JSON document and the correct format of the PUT statement, on Windows you may receive an error message from CURL complaining about UTF8 errors within the document and have a PUT failure.  The problem here is that the Windows file system supports several encoding schemes and various Windows programs save documents in different default encodings.  If you are using Notepad.exe to create your files be sure to save the files in ANSI format.
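One way to sidestep the editor question is to let Ruby write the file.  The sketch below is built on an assumption I have not verified against every editor: that the usual troublemaker is a byte order mark (BOM) written at the front of a UTF-8 file.  The helper names are made up; the 'r:bom|utf-8' read mode is standard Ruby and drops a leading BOM if one is present.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper: generate the design document from a Hash so the
# JSON syntax is correct by construction.
def design_doc_json
  doc = {
    '_id' => '_design/example',
    'views' => {
      'View00' => { 'map' => 'function(doc){emit(doc._id, doc.date)}' }
    }
  }
  JSON.pretty_generate(doc)
end

# Hypothetical helper: read a JSON file, skipping a UTF-8 BOM if present.
def read_without_bom(path)
  File.read(path, mode: 'r:bom|utf-8')
end

# Simulate a file saved by an editor that prepends a BOM.
file = Tempfile.new(['mydesign', '.json'])
file.binmode
file.write("\xEF\xBB\xBF".b + design_doc_json.b)
file.close

clean_text = read_without_bom(file.path)
puts JSON.parse(clean_text)['_id']
```

Writing clean_text back out (or writing design_doc_json directly with File.write) gives CURL a file with no stray leading bytes.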

 
Check your work using the FUTON interface: locate the “_design/example” document in deadbase

clip_image001[12]

Double click on the document:

clip_image001[16]

Note that “views” is a “Field” within the document.  Select the “Source” tab  and take a look inside the document:

clip_image002[4]

Now let’s POST a document into the database.  Since we have not defined any validation functions we can push anything into the database, even documents which consist of just “{}”.  CouchDB defines only one innate restriction:

If a document defines the id field (“_id”) then the value of _id must not conflict with an existing value of the ID field of ANY other document in the database.

If the document does not define an ID field, couchDB will generate an ID (as a UUID) and apply it to the document.  You can supply your own ID values.  You can either generate your own value (Ruby can generate a GUID for you) or you can request a GUID from couchdb with a GET command.  See this page for more information.  In the sample program I will be developing for this series I will be using a ‘natural key’, that is, a key whose value has an actual meaning (a Social Security number is such a natural key for example, but please never use this).  If you try to POST a document and use a duplicate key you will get back a 409 status code for the error.
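As a small sketch of the choice between a natural key and a generated one: the helper below (document_id is a made-up name) keeps a concert date as the ID when it is a real calendar date in YYYY-MM-DD form and otherwise falls back to a random 32 character hex string from Ruby’s standard library.  The fallback format is my choice for the example, not something couchDB requires.

```ruby
require 'date'
require 'securerandom'

# Hypothetical helper: natural key when the date is valid, random key otherwise.
def document_id(date_string)
  m = /\A(\d{4})-(\d{2})-(\d{2})\z/.match(date_string.to_s)
  return date_string if m && Date.valid_date?(m[1].to_i, m[2].to_i, m[3].to_i)
  SecureRandom.hex(16)  # 16 random bytes rendered as 32 hex characters
end

puts document_id('1972-07-22')   # a valid date is kept as the natural key
puts document_id('1972-13-45')   # not a real date, so a random key is used
```

With a natural key the 409 response becomes useful: it tells us the concert is already in the database.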

The document I will be using in the next post looks like this:

{
  "_id": "1972-07-22",
  "IAKey": "gd1972-07-22.sbd.miller.94112.sbeok.flac16",
  "description": "Set 1 Bertha Me And My Uncle You Win Again Jack Straw Bird Song Beat It On Down The Line Sugaree Black Throated …",
  "pubdate": "2008-08-15",
  "sb": true,
  "cm": true,
  "mx": false,
  "venue": "Paramount Northwest Theatre"
}

If I save this document as ConcertRecord.json I can use CURL to POST this document as:

curl -H "Content-Type: application/json" -X POST http://127.0.0.1:5984/deadbase/ -d @ConcertRecord.json

and couchdb will reply with an HTTP status 201 (Created) and a response body of:

{"ok":true,"id":"1972-07-22","rev":"1-01a182f329c40ba3bab4b13695d0a098"}

In couchDB Futon this document looks like:

clip_image001[20]

Note that the order of the fields is set by couchDB not the order in the first loaded document.

Ruby At Last

OK, enough of the command shell let’s do some couchDB work using RUBY.  I am going to access couchDB from a fairly low level within Ruby in these posts.  There are several ActiveRecord type GEMS which will interface with couchDB but my focus here will be on: (1)  speed of access and (2) transferability of knowledge between Ruby access and direct Javascript/Browser access to couchDB.

Here’s a minimum of what we need to POST a document to a couchdb using RUBY.

The GEMS we need:

JSON: This will always load the Ruby based (‘pure’) version of the JSON module.  If you want the C based module you will need to have the Ruby/Windows DevKit installed on your system.  For our purposes the C based version is not necessary.

REST-OPEN-URI: This extends open-uri by using the net/http and uri libraries to cover all of the REST verbs (GET, POST, PUT and DELETE).  This is a very light install and is only lightly documented.

Here is the basic plan:

Assume we have a RUBY object (call it “rec”) which includes, among other things, the fields we want to POST into the deadbase as a deadbase document like the one developed above.  We first need to convert the fields into a JSON string and then POST the JSON string into the deadbase.  The JSON GEM is used to achieve the first goal and REST-Open-URI is used to accomplish the second.

JSON Strings:

The JSON GEM will only serialize Ruby base types (strings, numbers, bools, arrays and HASH objects).  The JSON GEM is quite limited in that it will not serialize a Ruby object derived from the base RUBY object into a JSON string, even if that object consists only of base types and Hash objects.  Although you may extend JSON we did not choose to do so.  Rather we will create a simple Hash object and populate it manually via Ruby code with the fields we want to use for a document.  In its simplest form this could look like:

require 'json'

def makeJSON(rec)
  thing = Hash.new()  # we know that JSON can serialize this type of object
  thing["_id"] = rec.date
  thing["IAKey"] = rec.uri
  thing["description"] = rec.description
  thing["venue"] = rec.title
  thing["pubdate"] = rec.pubdate
  thing["cm"] = rec.cm
  thing["sb"] = rec.sb
  thing["mx"] = rec.mx
  return JSON.generate(thing)  # this returns a JSON String
end
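For completeness, here is roughly what the road not taken looks like: teaching a class to serialize itself by defining to_json.  RecordingDoc is a hypothetical stand-in with only a few of the fields (it is not the GDRecording class from this series), so treat this as a sketch of the idiom rather than the code I actually used.

```ruby
require 'json'

# Hypothetical stand-in for a recording class, with only a few fields.
class RecordingDoc
  attr_reader :date, :uri, :title, :cm

  def initialize(date, uri, title, cm)
    @date, @uri, @title, @cm = date, uri, title, cm
  end

  # Only base types go into the Hash, so the json library can serialize it.
  def to_h
    { '_id' => date, 'IAKey' => uri, 'venue' => title, 'cm' => cm }
  end

  # The json library calls to_json on objects it meets while generating.
  def to_json(*args)
    to_h.to_json(*args)
  end
end

rec = RecordingDoc.new('1972-07-22', 'gd1972-07-22.sbd.miller.94112.sbeok.flac16',
                       'Paramount Northwest Theatre', true)
puts rec.to_json
puts JSON.generate([rec])
```

The manual Hash in makeJSON does the same job with less ceremony, which is why I stayed with it.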

REST-OPEN-URI:

Our POST routine will use the output from makeJSON and POST the JSON string to the deadbase.  In simple form this routine looks like:

def PostRecording(jsonString)
  uri = "http://127.0.0.1:5984/deadbase/"   # this is our database
  begin
    responseBody = open(uri, :method => :post, :body => jsonString, "Content-Type" => "application/json").read
    puts 'POST Response Success: ' + responseBody
  rescue OpenURI::HTTPError => the_error
    puts 'Post Response Error: ' + the_error.io.status[0]
  end
end

The key line is, of course:

responseBody=open(uri,:method=> :post, :body => jsonString,"Content-Type" => "application/json").read

If we ran this line as:

responseBody=open(uri,:method=> :post, :body => jsonString).read

we would get an HTTP status code for an “Unsupported Media Type” (415).  That’s because the default “Content-Type” for POST commands is “application/x-www-form-urlencoded”, which is the typical format of an HTML “form” involved in a POST from a web browser.  We are far from a browser here and our “Content-Type” needs to be “application/json”.  The way to add headers to the POST is to provide one or more key/value pairs with the desired header information.  Hence:

"Content-Type" => "application/json"

and the correct Ruby line is:

responseBody=open(uri,:method=> :post, :body => jsonString,"Content-Type" => "application/json").read

We need to wrap the POST command in an exception block where the line:

OpenURI::HTTPError => the_error

is only reached IF the HTTP response status is > 399.  You can then do more fine grained responses to the error condition.  Specifically, if the_error.io.status[0]=="409" (the status code arrives as a string) you have attempted to POST the same document twice (at least two documents with the same ID).
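If you do not have rest-open-uri installed, the same POST can be sketched with Net::HTTP from the standard library.  To be clear, this swaps in a different library from the one used above; build_post, describe_status and post_document are names I made up, and the mapping of status codes to symbols is my own reading of the responses discussed in this post (415 in particular is an assumption about how couchDB answers a wrong Content-Type).

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical helper: build the POST request without sending it.
def build_post(db_uri, doc)
  request = Net::HTTP::Post.new(URI(db_uri))
  request['Content-Type'] = 'application/json'  # the header couchDB needs
  request.body = JSON.generate(doc)
  request
end

# Hypothetical helper: turn an HTTP status code into something to branch on.
def describe_status(code)
  case code.to_i
  when 201, 202 then :stored
  when 409      then :conflict          # a document with this _id already exists
  when 415      then :bad_content_type  # assumption: wrong or missing Content-Type
  else code.to_i >= 400 ? :error : :other
  end
end

# Hypothetical helper: send the document and report the outcome.
def post_document(db_uri, doc)
  uri = URI(db_uri)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(build_post(db_uri, doc)) }
  describe_status(response.code)
end

request = build_post('http://127.0.0.1:5984/deadbase/', { '_id' => '1972-07-22' })
puts request['Content-Type']
puts request.body
```

With couchDB running locally, post_document('http://127.0.0.1:5984/deadbase/', doc) would send the request; the lines above only build it, so they run without a server.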

That looks like a wrap for now.


Posted 2011/07/22 by Cloud2013 in Aptana, couchdb, REST, Ruby
