Gately’s
A Premier Internet Shopping Site
A Case Study Using Sawmill
July 2006

Summary

Gately’s is a chain of web stores selling affordable products with high-quality service and integrity. Their primary objective was to obtain numerical data on the return on investment of each of their internet advertisements, especially their Google AdWords advertisements, to determine which ads were costing more than they brought in. A secondary objective was to create a system that could report which items were ordered through the website, how often, and how much was spent on them.

Implementation

The core of the implementation was a Sawmill log format plug-in based on the standard Apache plug-in included in Sawmill. In addition to tracking all web log fields, this plug-in also tracked order information, conversion information, and numerical fields such as dollars spent.

Ideally, every paid advertisement would point to a URL that contains the search engine and search phrase of the advertisement. When the URL does not carry this information, Sawmill can be configured to extract the search engine and search phrase from the referrer field, but this is not as reliable as query-based extraction; in particular, it is generally not effective for content-based AdWords. Nevertheless, a reasonable percentage of search phrases were extracted from referrers, using the standard search engine rules built into Sawmill, supplemented with custom rules implemented through log filters. Gately’s eventually added search engine and search phrase information to every advertisement URL, at which point the referrer-based rules were made secondary: query-based information was used when available, with referrer-based information as a fallback.

On the Gately’s websites, each incoming visitor was assigned a unique OrderID, which appeared in the log data as a cookie.
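
The query-first, referrer-fallback rule for search engines and search phrases described above can be illustrated with a minimal sketch. It is written in Python, not in Sawmill's own filter language, and it is not the plug-in Gately’s used. The landing-page parameter names `se` and `sp` and the two referrer rules are assumptions made up for this example; the case study does not name them.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical landing-page parameters carrying the ad's engine and phrase.
ENGINE_PARAM = "se"
PHRASE_PARAM = "sp"

# Illustrative referrer rules: host fragment -> (engine name, phrase parameter).
REFERRER_RULES = {
    "google.": ("Google", "q"),
    "yahoo.": ("Yahoo", "p"),
}


def from_query(url):
    """Return (engine, phrase) taken from the landing URL's query, or None."""
    params = parse_qs(urlsplit(url).query)
    engine = params.get(ENGINE_PARAM, [None])[0]
    phrase = params.get(PHRASE_PARAM, [None])[0]
    if engine and phrase:
        return engine, phrase
    return None


def from_referrer(referrer):
    """Return (engine, phrase) guessed from the referrer field, or None."""
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    params = parse_qs(parts.query)
    for fragment, (engine, key) in REFERRER_RULES.items():
        if fragment in host and key in params:
            return engine, params[key][0]
    return None


def search_source(url, referrer):
    """Prefer query-based information; fall back to the referrer."""
    return from_query(url) or from_referrer(referrer)
```

Calling `search_source(url, referrer)` on each hit yields the query-based pair when the advertisement URL carries it, the referrer-based pair otherwise, and `None` when neither rule matches.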
A SQL database stored information on each order in an Orders table, including the OrderID, the items in the order (items purchased and quantity of each), tax, shipping, and customer information. Items were referred to by ItemID in the order record, and a separate Items table in the SQL database contained information about each item, including an English description of the item, the vendor, and the price. The Orders table and Items table existed prior to the use of Sawmill, and nothing in the SQL database had to be changed to use it with Sawmill.

To integrate the SQL database with Sawmill's web log analysis, Gately’s created a script that exported the contents of the Orders and Items tables in Sawmill's CFG (configuration group) format, a very simple, highly compact, highly legible textual format. Once data is in a CFG file, it typically requires only one line in a log filter for Sawmill to access the file. This logic was added to the log format plug-in to extract the item information, order information, and numerical values (dollars spent) from the Orders and Items files, and to add them as standard database fields and standard reports in the web log analysis.

To track conversions, it was necessary to associate the search engine and search phrase of the original hit (the one that resulted from the advertisement click) with the final order. This was done by keeping the search engine and search phrase information internally in a node in Sawmill, using Salang, Sawmill's built-in language, and recovering that information at the time the order hit occurred in the web log data. This was very simple, due to Salang's natural support for hierarchical hash-based nodes, which can be persistent across multiple invocations; for instance, if the advertisement hit occurs on one day and the order occurs on a later day, the information is still carried across properly.
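
The conversion-tracking idea can be sketched roughly as follows: remember the search engine and search phrase under the visitor's OrderID, then credit the order's dollar value to that pair when the order hit appears. This is a Python illustration, not Salang and not the actual plug-in. The `ITEMS` and `ORDERS` dictionaries, their values, the hit field names, and the choice to keep the first recorded source are all assumptions for the example. A plain in-memory dictionary stands in for the persistent node, so persistence across runs is not shown.

```python
from collections import defaultdict

# Hypothetical stand-ins for the exported Items and Orders tables.
ITEMS = {
    "A100": {"description": "Desk lamp", "price": 20.00},
    "B200": {"description": "Wall clock", "price": 15.00},
}
ORDERS = {
    "ord-1": {"items": {"A100": 2, "B200": 1}},  # ItemID -> quantity
}


def order_total(order_id):
    """Dollar value of an order: item price times quantity, summed."""
    items = ORDERS[order_id]["items"]
    return sum(ITEMS[item_id]["price"] * qty for item_id, qty in items.items())


def attribute_sales(hits):
    """Credit each order's value to the source recorded for its OrderID.

    Each hit is a dict with 'order_id' (the cookie value), an optional
    'source' tuple (engine, phrase), and an optional 'is_order' flag.
    """
    first_source = {}            # OrderID -> (engine, phrase)
    sales = defaultdict(float)   # (engine, phrase) -> dollars
    for hit in hits:
        order_id = hit["order_id"]
        source = hit.get("source")
        # Keep the source of the original advertisement hit (assumption).
        if source and order_id not in first_source:
            first_source[order_id] = source
        if hit.get("is_order") and order_id in ORDERS:
            key = first_source.get(order_id, ("(none)", "(none)"))
            sales[key] += order_total(order_id)
    return dict(sales)
```

For example, an advertisement hit with source `("Google AdWords", "desk lamp")` followed later by an order hit carrying the same OrderID `ord-1` credits that order's total to that search phrase.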
Result

The resulting reports showed all of the standard web log analysis reports (date/time reports, source IPs and countries/regions/cities, search engines, search phrases, pages and directories, and much more), but integrated the order information naturally, through new numerical columns and new standard reports. Each report included a "purchase amount" column indicating the dollar value of purchases associated with each row. This provided not only such useful information as sales by country and sales by domain (each store is in its own domain, so this is sales by store), but also the primary objective: the standard Search Engines, Search Phrases, and Search Phrases by Search Engine reports included the column as well, so under Google AdWords, each search phrase appeared as a separate row, with the total sales dollars generated by the visitors referred by that keyword.

Secondarily, the profile included custom Orders and Items reports, which showed the sales of each item, in quantity and dollars. The reports were configured to zoom naturally, so the Domains report showed the sales by store, zooming in on a particular store showed the orders for that store, and zooming in on an order showed the items for the order. Sawmill's unlimited zooming capabilities allowed any combination of filters or zooms to be applied; for instance, it is possible to zoom on a particular date range, then on a particular store, then on a particular country, and finally click Items to see the items sold (including dollar amounts) for that date range, in that store, from buyers in that country. This easily fulfilled the order analysis requirement, and provided new ways of examining the data that had not been in the original proposal.

Finally, the analysis was a full-featured web log analysis.
Even ignoring the order information, the reports provided all the information of a web traffic analysis, from unique IPs to sessions (entry/exit pages, paths through the site, session durations, repeat visitors, and more), to visitor demographics (geographic, domains, IP addresses, web browsers and operating systems), to content information (which pages and directories were hit and which file types), to technical information (bandwidth usage, server response times, screen dimensions, server response codes, broken links, and other detailed metrics).

Cost

The software license cost under $1,000, and the customization work, which was contracted to Flowerfire by Gately’s, took less than 40 hours. The total project cost was under $10,000, and the project was completed in under six weeks. Work began in mid-November with a goal of having numbers for the holiday season, and the goal was achieved. It should be noted that though Flowerfire did the work on a professional services basis, none of the work required new development on Sawmill; everything in the project was done using the existing Sawmill infrastructure, and could have been done by any expert Sawmill user without assistance from Flowerfire.

Quote from Gately’s

Pete Scudamore, who represented Gately’s and was the primary point of contact at Gately’s through the entire project, said, "I spent a long time looking for a solution which could provide actual return on investment for our pay-per-click advertisements, by connecting our existing SQL data with the web logs. I spoke with many companies that seemed, at first glance, to be offering this type of analysis. But when it came down to it, none of them had the flexibility to do what we needed; the solution simply was not available through any product other than Sawmill. Sawmill was the only solution I found which could provide the actual sales generated by each advertisement, from the information we had available.
Flowerfire worked with me through the whole process, and completed the project on time and under budget."