A few months ago a fellow developer asked me how to use Word Automation Services (WAS) in an application that requires synchronous document conversion. In this post I show a simple way to do that.
As you might know (if not, you can read a bit more about it at the end of this post), you can submit Word documents to WAS and have it convert them to PDF or other formats such as XPS. WAS works as a timer job, so conversion happens on the schedule of that job, which you should set based on the number of documents to be converted and the free resources of the server. As with any timer job, you can start the Word Automation Services Timer Job immediately, either from the web UI or from custom code.
The sample method takes the document content as a byte array and returns the converted document as a byte array as well. I first implemented a Stream-based solution, but found byte arrays easier to work with in this case (the reason follows a bit later).
After preparing and starting the ConversionJob, we start the WAS timer job if immediate conversion is requested, then wait until the conversion finishes, successfully or not, or until the timeout interval elapses. On timeout, we cancel the conversion job. Next we display any conversion errors and, if requested, delete the documents from the working document library.
- // Namespaces this listing relies on
- using System;
- using System.Threading;
- using Microsoft.SharePoint;
- using Microsoft.Office.Word.Server.Conversions;
- using Microsoft.Office.Word.Server.Service;
- private byte[] ConvertDocument(SPWeb web, byte[] docToConvert, bool isImmediate,
-     String conversionLibName, int timeOutSecs, bool deleteDocs)
- {
-     byte[] result = null;
-     SPList conversionLib = web.Lists[conversionLibName];
-     SPFolder folder = conversionLib.RootFolder;
-     // Get the default proxy for the current Word Automation Services instance
-     SPServiceContext serviceContext = SPServiceContext.GetContext(web.Site);
-     WordServiceApplicationProxy wordServiceApplicationProxy =
-         (WordServiceApplicationProxy)serviceContext.GetDefaultProxy(typeof(WordServiceApplicationProxy));
-     // Configure the job: run as the current user, overwrite existing output, produce PDF
-     ConversionJob job = new ConversionJob(wordServiceApplicationProxy);
-     job.UserToken = web.CurrentUser.UserToken;
-     job.Settings.UpdateFields = true;
-     job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite;
-     job.Settings.OutputFormat = SaveFormat.PDF;
-     // Upload the source document under a unique, GUID-based name
-     String docFileName = Guid.NewGuid().ToString("D");
-     // we replace possible existing files on upload
-     // although there is a minimal chance for GUID duplicates
-     SPFile docFile = folder.Files.Add(docFileName + ".docx", docToConvert, true);
-     String docFileUrl = String.Format("{0}/{1}", web.Url, docFile.Url);
-     // The output URL is the input URL with ".docx" (5 characters) replaced by ".pdf"
-     String pdfFileUrl = String.Format("{0}/{1}.pdf",
-         web.Url, docFile.Url.Substring(0, docFile.Url.Length - 5));
-     job.AddFile(docFileUrl, pdfFileUrl);
-     // let's do the job
-     // PowerShell equivalent of the immediate start: Start-SPTimerJob "Word Automation Services"
-     job.Start();
-     if (isImmediate)
-     {
-         StartServiceJob("Word Automation Services Timer Job");
-     }
-     ConversionJobStatus cjStatus = new ConversionJobStatus(wordServiceApplicationProxy, job.JobId, null);
-     // set up timeout
-     TimeSpan timeSpan = new TimeSpan(0, 0, timeOutSecs);
-     DateTime conversionStarted = DateTime.Now;
-     // the job contains a single file, so it is done when one item has succeeded or failed
-     int finishedConversionCount = cjStatus.Succeeded + cjStatus.Failed;
-     while ((finishedConversionCount != 1) && ((DateTime.Now - conversionStarted) < timeSpan))
-     {
-         // wait a sec.
-         Thread.Sleep(1000);
-         // re-read the job status on each iteration
-         cjStatus = new ConversionJobStatus(wordServiceApplicationProxy, job.JobId, null);
-         finishedConversionCount = cjStatus.Succeeded + cjStatus.Failed;
-     }
-     // timed out -> cancel conversion
-     if (finishedConversionCount != 1)
-     {
-         job.Cancel();
-     }
-     // we can output the possible failed conversion error(s)
-     foreach (ConversionItemInfo cii in cjStatus.GetItems(ItemTypes.Failed))
-     {
-         Console.WriteLine("Failed conversion. Input file: '{0}'; Output file: '{1}'; Error code: '{2}'; Error message: '{3}';",
-             cii.InputFile, cii.OutputFile, cii.ErrorCode, cii.ErrorMessage);
-     }
-     SPFile convertedFile = web.GetFile(pdfFileUrl);
-     // shouldn't be null (unless there is a conversion error)
-     // but we check for sure
-     if ((convertedFile != null) && (convertedFile.Exists))
-     {
-         // read the whole PDF at once; no stream to dispose
-         result = convertedFile.OpenBinary();
-         // delete result doc if requested
-         if (deleteDocs)
-         {
-             convertedFile.Delete();
-         }
-     }
-     // delete source doc if requested
-     if (deleteDocs)
-     {
-         docFile.Delete();
-     }
-     return result;
- }
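The wait-with-timeout part of ConvertDocument can also be factored out into a small helper. The following is only a minimal sketch, not part of the original sample: the `Polling` class and its `WaitUntil` method are hypothetical names, and the helper knows nothing about WAS, it just polls a condition you pass in. It uses `Stopwatch` instead of `DateTime.Now`, so changes to the system clock do not affect the timeout.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not from the original sample): generic polling with a timeout.
public static class Polling
{
    // Calls isDone repeatedly until it returns true or the timeout elapses.
    // Returns true if the condition was met, false if the wait timed out.
    public static bool WaitUntil(Func<bool> isDone, TimeSpan timeout, TimeSpan pollInterval)
    {
        // Stopwatch measures elapsed time independently of the wall clock
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!isDone())
        {
            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}
```

In ConvertDocument, the condition passed as `isDone` would presumably create a fresh ConversionJobStatus and check whether `Succeeded + Failed` has reached 1, and a `false` return value would be the signal to call `job.Cancel()`.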
To start immediate conversion in the ConvertDocument method, I used a slightly modified version of the StartServiceJob method introduced in my former post.
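- // Requires: using System.Linq; using Microsoft.SharePoint.Administration;
- // Runs every timer job of the given type; a null serviceTypeName matches any service
- private void StartServiceJob(string serviceTypeName, string jobTypeName)
- {
-     SPFarm.Local.Services.ToList().ForEach(
-         svc => svc.JobDefinitions.ToList().ForEach(
-             jd =>
-             {
-                 if ((jd.TypeName == jobTypeName) && ((serviceTypeName == null) || (serviceTypeName == svc.TypeName)))
-                 {
-                     jd.RunNow();
-                 }
-             }));
- }
- // Convenience overload: match the job type name in any service
- private void StartServiceJob(string jobTypeName)
- {
-     StartServiceJob(null, jobTypeName);
- }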
The following code snippet shows a sample call of the ConvertDocument method. Here we request an immediate conversion with a 240-second timeout and use the standard Shared Documents document library as the working folder, deleting the temporary files afterwards.
- DateTime startTime = DateTime.Now;
- byte[] doc = File.ReadAllBytes(@"C:\Data\HelloWorld.docx");
- byte[] pdf = ConvertDocument(web, doc, true, "Shared Documents", 240, true);
- if (pdf != null)
- {
-     File.WriteAllBytes(@"C:\Data\HelloWorld.pdf", pdf);
- }
- Console.WriteLine("Duration of conversion: {0} ms", (DateTime.Now - startTime).TotalMilliseconds);
The sample above requires further work before you can use it in a real application. First, you should add extra error handling, for example checking whether the default WordServiceApplicationProxy is found at all.
Next, instead of submitting documents to WAS one by one, it is better to create a ConvertDocument version that supports converting multiple documents. In that case you should use arrays of byte arrays, which I found easier than handling multiple streams simultaneously (disposing each one through using blocks, for example).
You can extend the supported conversion options to other formats as well, like XPS.
In a real-life application you probably wouldn’t want to start an immediate conversion on each request, because it might put a heavy load on your servers. Instead, you can create a dedicated queue for documents, giving high-privilege users the option to submit specific document types for immediate conversion and leaving the default conversion schedule for the rest.
Although our original goal was to create a synchronous conversion method, sometimes it is more convenient to do the conversion asynchronously, for example to avoid blocking the UI thread. To support that in your application, you can start ConvertDocument on a separate thread and raise custom .NET events to report the outcome of the conversion job.
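One possible shape for such a queue is sketched below. Everything in it is an assumption made for illustration: the `ConversionRequest` and `ConversionQueue` types, the `IsPrivilegedUser` flag and the idea of identifying document types by a plain string are hypothetical, and how you determine user privileges is up to your application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical request type: what to convert and on whose behalf
public class ConversionRequest
{
    public string FileName { get; set; }
    public string DocumentType { get; set; }
    public bool IsPrivilegedUser { get; set; }
}

// Hypothetical two-lane queue: one lane for immediate conversion,
// one for documents that wait for the regular timer job schedule
public class ConversionQueue
{
    private readonly HashSet<string> immediateTypes;
    private readonly Queue<ConversionRequest> immediate = new Queue<ConversionRequest>();
    private readonly Queue<ConversionRequest> scheduled = new Queue<ConversionRequest>();

    public ConversionQueue(IEnumerable<string> immediateDocumentTypes)
    {
        // Document type names are compared case-insensitively
        immediateTypes = new HashSet<string>(immediateDocumentTypes, StringComparer.OrdinalIgnoreCase);
    }

    public int ImmediateCount { get { return immediate.Count; } }
    public int ScheduledCount { get { return scheduled.Count; } }

    // Routes the request to one of the two lanes and returns true
    // when it qualifies for immediate conversion
    public bool Enqueue(ConversionRequest request)
    {
        bool isImmediate = request.IsPrivilegedUser
            && immediateTypes.Contains(request.DocumentType);
        (isImmediate ? immediate : scheduled).Enqueue(request);
        return isImmediate;
    }
}
```

The value returned by `Enqueue` could then be passed on as the `isImmediate` argument of ConvertDocument, so only the privileged lane triggers the timer job right away.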
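A minimal sketch of that idea follows. The `AsyncConverter` class, its `ConversionCompleted` event and the `ConversionCompletedEventArgs` type are hypothetical names, and the actual conversion is passed in as a delegate, so the sketch does not depend on SharePoint at all. It uses a plain `Thread` so that it also compiles on .NET 3.5.

```csharp
using System;
using System.Threading;

// Hypothetical event arguments describing the outcome of one conversion
public class ConversionCompletedEventArgs : EventArgs
{
    public byte[] Result { get; set; }   // null if nothing was converted
    public Exception Error { get; set; } // set if the conversion delegate threw
}

// Hypothetical wrapper that runs a synchronous conversion on a worker thread
public class AsyncConverter
{
    private readonly Func<byte[], byte[]> convert;

    // Raised on the worker thread when the conversion has finished
    public event EventHandler<ConversionCompletedEventArgs> ConversionCompleted;

    public AsyncConverter(Func<byte[], byte[]> convert)
    {
        this.convert = convert;
    }

    // Starts the conversion and returns the worker thread,
    // so callers can Join it if they need to wait after all
    public Thread ConvertAsync(byte[] document)
    {
        Thread worker = new Thread(() =>
        {
            ConversionCompletedEventArgs args = new ConversionCompletedEventArgs();
            try
            {
                args.Result = convert(document);
            }
            catch (Exception ex)
            {
                args.Error = ex;
            }
            // Copy the delegate first so a concurrent unsubscribe cannot cause a null call
            EventHandler<ConversionCompletedEventArgs> handler = ConversionCompleted;
            if (handler != null)
            {
                handler(this, args);
            }
        });
        worker.IsBackground = true;
        worker.Start();
        return worker;
    }
}
```

The delegate would wrap the call to ConvertDocument. Two caveats: the event handler runs on the worker thread, so UI code has to marshal back to the UI thread itself, and since SharePoint objects such as SPWeb are generally not meant to be shared across threads, the delegate should probably open its own SPSite and SPWeb rather than reuse the caller's.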