When should I use Apache Druid?

I have been having some good side discussions about Druid, and the most common question is “when should I use Druid?”. The good news is that the Druid documentation, under the Latest Design section, answers this question directly:

Druid is likely a good choice if your use case fits a few of the following descriptors:

  • Insert rates are very high, but updates are less common.
  • Most of your queries are aggregation and reporting queries (“group by” queries). You may also have searching and scanning queries.
  • You are targeting query latencies of 100ms to a few seconds.
  • Your data has a time component (Druid includes optimizations and design choices specifically related to time).
  • You may have more than one table, but each query hits just one big distributed table. Queries may potentially hit more than one smaller “lookup” table.
  • You have high cardinality data columns (e.g. URLs, user IDs) and need fast counting and ranking over them.
  • You want to load data from Kafka, HDFS, flat files, or object storage like Amazon S3.

Event-based data obviously works very well with Druid, which is why I believe orders are a really good match. Because you can tie three critical pieces together for each order (SKU, customer data, and shipping), it becomes very easy to execute all kinds of queries tying these data points together.

While I am somewhat stuck on eCommerce, here is a list of other companies that also use Druid for very different use cases (link).  Here are a few of my favorites:

Airbnb – Druid powers slice and dice analytics on both historical and real-time metrics. It significantly reduces latency of analytic queries and helps people to get insights more interactively.

eBay – eBay uses Druid to aggregate multiple data streams for real-time user behavior analytics by ingesting at a very high rate (over 100,000 events/sec), with the ability to query or aggregate data by any random combination of dimensions, and to support over 100 concurrent queries without impacting ingest rate and query latencies.

Hulu – At Hulu, we use Druid to power our analytics platform that enables us to interactively deep dive into the behaviors of our users and applications in real-time.

Monetate – Druid is a critical component in Monetate’s personalization platform, where it acts as the serving layer of a lambda architecture. As such, Druid powers numerous real-time dashboards that provide marketers valuable insights into campaign performance and customer behavior.

Nielsen – Nielsen Marketing Cloud uses Druid as its core real-time analytics tool to help its clients monitor, test and improve their audience targeting capabilities. With Druid, Nielsen provides its clients with in-depth consumer insights leveraging world-class Nielsen audience data.

The original list is pretty large; it is fairly safe to say Druid has a place in many markets!


Product Recommendations made easy with Apache Druid Part 1

I have been playing with Apache Druid for a bit now and I have to say I am very impressed with this package. Druid provides fast analytical queries, at high concurrency, on event-driven data. Druid can instantaneously ingest streaming data and provide sub-second queries to power interactive UIs (link). Apache Druid essentially does all of the heavy lifting of segmenting the data and putting it into high-performing indexes for super fast queries. You can stream the data directly into Druid using APIs or Apache Kafka, or you can simply upload massive amounts of data at intervals, appending or replacing.

Because Druid does so much for you, you could actually run different campaigns using completely different data sources that are stored and indexed in Druid. Imagine running a campaign for “Hottest Items Last Fall” or “Season’s Top Sellers”. This would produce a product shelf similar to this on your eCommerce site:

Screen Shot 2019-08-07 at 10.17.20 AM.png

Those products could have been returned by Druid in real time, sorted by order value or quantity sold, and even filtered by shopper attributes (age, gender, location).

Screen Shot 2019-08-07 at 1.40.12 PM.png

Druid lets you store as many data sources as you want, so you could actually build dynamic components in CoreMedia that run the same campaigns on different data sources. This could be used for different brands and their SKUs, or even for seasonal order data.

Screen Shot 2019-08-07 at 10.20.22 AM

For my use case, this means you could essentially push order line item data into Druid and get fast queries for product shelves like “Top Sellers”, “Top Weekend Sales”, or even “This week’s hits”, all based on the order line sales and the time and date stamp of the order.

Pushing this line-item-level order information should be trivial for most order management systems. I started to ask myself what data I would actually need to satisfy a few use cases, so I wrote some use cases down as one-liners:

  • Most products sold
  • Total sales
  • Highest Total count sold on day of week
  • Highest Total count sold in month of year
  • Highest Total sales on day of week
  • Highest Total sales in week of year
  • Region top seller
  • Men top seller
  • Women top seller in region

I then had to figure out the minimum amount of data needed to be able to do those use cases and this is what I came up with:

“time”, “order_id”, “shopper_id”, “sku”, “price”, “quantity”, “cost”, “shipping_info”

That is all pretty standard information you can get from a PO. What is not part of that is the customer demographic information. Because Druid performs best with flat data, we will most likely have to write a routine that combines order line data with customer attribute data. We could include fields like these (if they are known):

“age”, “region”, “gender”

This would allow us to ask Druid many different queries and get the proper response. In the CoreMedia extension model this should really be a returned list of SKUs that we can map to the current product catalog. Some error handling or SKU replacement code might be needed, especially if you are running against year-old data. Hopefully, for more current campaigns like “Hottest Weekend Products” or “What’s hot this month”, the data and SKUs are very up to date. The resulting JSON sent in for each row would look like this:

{
"time":"2019-06-30 03:53:35",
"order_id":"id_055300006130",
"age":"40",
"region":"Midwest",
"gender":"M",
"shopper_id":"U_09080785",
"sku":"PC_CHEF_CORP_MP_KNIFE_SIGNAL_RED_SKU",
"price":79.0,
"quantity":3,
"cost":237.0
}
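The routine that combines order line data with customer attributes could be sketched roughly as follows. This is only an illustration under assumptions: the function name and the in-memory shopper lookup are hypothetical, and “cost” is taken to be price times quantity because that matches the sample row above (79.0 x 3 = 237.0).

```python
import json


def flatten_order_line(line, shoppers):
    """Merge one order line with the shopper's known attributes into one flat row."""
    row = {
        "time": line["time"],
        "order_id": line["order_id"],
        "shopper_id": line["shopper_id"],
        "sku": line["sku"],
        "price": line["price"],
        "quantity": line["quantity"],
        # Assumption: "cost" is the line total, as in the sample row (79.0 * 3 = 237.0).
        "cost": round(line["price"] * line["quantity"], 2),
    }
    # Demographics are optional, so copy only the attributes that are known.
    attributes = shoppers.get(line["shopper_id"], {})
    for field in ("age", "region", "gender"):
        if field in attributes:
            row[field] = attributes[field]
    return row


# Hypothetical inputs shaped like the sample row above.
order_line = {
    "time": "2019-06-30 03:53:35",
    "order_id": "id_055300006130",
    "shopper_id": "U_09080785",
    "sku": "PC_CHEF_CORP_MP_KNIFE_SIGNAL_RED_SKU",
    "price": 79.0,
    "quantity": 3,
}
shoppers = {"U_09080785": {"age": "40", "region": "Midwest", "gender": "M"}}

# One JSON object per line is a convenient shape for handing rows to Druid.
print(json.dumps(flatten_order_line(order_line, shoppers)))
```

Rows for shoppers with no known attributes simply come out without the demographic fields, which keeps the data flat without inventing values.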

Sending in each order line item separately allows Druid to dynamically build orders and return SKUs based on any time and date combination, bloom filters, numeric expressions, and of course grouping (total sales for a single SKU) (link).

I created a dataset with six months of order data, broken out by each line item as described above. It ended up being 431,148 line items created for 4,323 SKUs in 300,000 orders.
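For anyone wondering how a file of such rows might be loaded, here is a rough sketch of a native batch ingestion spec built in Python. It is not the spec I used: the datasource name, directory, and file filter are placeholders, and the layout follows the inputSource/inputFormat style of more recent Druid releases, so check it against the documentation for your version. The timestamp format matches the “time” values in the sample row.

```python
import json


def build_batch_spec(datasource, base_dir, file_filter, append):
    """Build a native batch (index_parallel) ingestion spec for flat order rows."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # The sample rows carry timestamps like "2019-06-30 03:53:35".
                "timestampSpec": {"column": "time", "format": "yyyy-MM-dd HH:mm:ss"},
                "dimensionsSpec": {
                    "dimensions": [
                        "order_id", "shopper_id", "sku", "age", "region", "gender",
                        {"type": "double", "name": "price"},
                        {"type": "long", "name": "quantity"},
                        {"type": "double", "name": "cost"},
                    ]
                },
                "granularitySpec": {
                    "segmentGranularity": "month",
                    "queryGranularity": "none",
                    # Keep every line item as its own row (no pre-aggregation).
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "local", "baseDir": base_dir, "filter": file_filter},
                "inputFormat": {"type": "json"},
                # True adds to existing data; False overwrites the intervals covered.
                "appendToExisting": append,
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


# Placeholder values: adjust the datasource name and paths to your environment.
spec = build_batch_spec("orders", "/data/orders", "*.json", append=False)
print(json.dumps(spec, indent=2))
```

The resulting JSON would be submitted to Druid as an ingestion task, for example through the web console or the task API.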

I went ahead and created queries for each of those use cases and found Druid to be extremely fast (more on that in Part 2), even when running on my local machine. Check out the slide show below for the various ways you can use SQL (or JSON) to query Druid. The real power comes from the way Druid can quickly return rows and run functions like TIME_EXTRACT. Each query essentially returns a list of SKUs ordered descending by either a total sales amount or an items sold count.
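To give a feel for the shape of these queries, here is a minimal sketch that builds the SQL for a few of the use cases and posts it to Druid’s SQL endpoint. These are not the exact queries from my slides: the datasource name “orders” and the localhost URL are assumptions, and the day-of-week number is whatever numbering Druid uses for DOW, so check the documentation before relying on it.

```python
import json
import urllib.request

# Assumptions for this sketch: datasource name and a local Druid SQL endpoint.
DATASOURCE = "orders"
SQL_ENDPOINT = "http://localhost:8888/druid/v2/sql"

# Rank by items sold, or by sales value ("cost" holds the line total).
MEASURES = {"units": 'SUM("quantity")', "sales": 'SUM("cost")'}


def top_skus_sql(measure="units", day_of_week=None, filters=None, limit=10):
    """Build a Druid SQL query returning SKUs ranked by the chosen measure."""
    conditions = []
    if day_of_week is not None:
        # TIME_EXTRACT pulls a calendar unit out of the row timestamp.
        conditions.append("TIME_EXTRACT(__time, 'DOW') = %d" % int(day_of_week))
    # Column names must come from trusted code; only the values are escaped here.
    for column, value in (filters or {}).items():
        escaped = str(value).replace("'", "''")
        conditions.append('"%s" = \'%s\'' % (column, escaped))
    where_clause = " WHERE " + " AND ".join(conditions) if conditions else ""
    return (
        'SELECT "sku", ' + MEASURES[measure] + ' AS "total" '
        'FROM "' + DATASOURCE + '"' + where_clause + " "
        'GROUP BY "sku" ORDER BY "total" DESC LIMIT ' + str(int(limit))
    )


def run_query(sql):
    """POST the query to the SQL endpoint and return the decoded JSON rows."""
    request = urllib.request.Request(
        SQL_ENDPOINT,
        data=json.dumps({"query": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# "Most products sold" and "Women top seller in region" from the list above.
print(top_skus_sql())
print(top_skus_sql(measure="sales", filters={"gender": "F", "region": "Midwest"}))
```

Calling run_query(top_skus_sql()) against a running Druid would return the ranked SKUs, which is the list a product shelf needs.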


Stay tuned for Part 2, where I show how easily these kinds of dynamic product shelves based on sales and shopper data can be integrated into CoreMedia Studio. I will also show a demonstration where Apache Druid is accessed in real time from our Studio, where the marketing person can easily preview this dynamic behavior. Here is a little teaser showing how the authoring environment (Preview CAE) and the runtime environment could access the same Druid data, giving marketers the same products that shoppers would see.

 

I am really interested in hearing your thoughts on this, send me an email or leave a comment!

Screen Shot 2019-08-07 at 3.45.25 PM.png

CoreMedia Demo Jam 3 – Time travel is possible!

Scheduling content and delivering campaigns can be difficult, especially if it requires multiple teams to deliver the content or, even worse, the dreaded IT involvement. Personalized content based on shopping behavior, demographics, or simply date and time can be difficult if you can’t preview the site hitting the various rules. This is where CoreMedia Studio enables you to see your site as different personas, or even at a different date and time, with ease. Some people say time travel is impossible; Ancient CoreMedia Architects say otherwise…

In this Demo Jam I show how easy it is to set up rules for content to show on specific days, just one of the ways you can run your campaigns and keep your site fresh each day.

If you didn’t get the reference to Ancient Aliens, make sure you watch it on the History Channel Friday evenings.

DemoJam: MS Word, Products, and Articles – oh my!

In this three-minute demo jam I show how simple it is to go from Microsoft Word to an online article. The demonstration covers loading an article from Word, cleaning out the Microsoft HTML to produce optimized HTML that preserves the fonts and pictures from the Word document, modifying the crop of the image, and previewing the article teaser across all channels, including social media.

Why are brands adopting a headless eCommerce approach?

In this short interview I ask Drew Lau, VP of Product from Mobify, why more and more brands are adopting a headless approach for eCommerce and content. Take the pain out of managing and coding your front-end and use Mobify’s front-end-as-a-service platform to make your digital experience iconic!

Mobify’s Front-end as a Service unlocks the agility of a headless commerce approach while powering fast, immersive experiences with PWAs, AMP, and native apps. – link

In this interview Drew answers these relevant questions:

  • Why are more and more retailers and brands adopting a headless commerce approach?
  • What’s the value in taking a headless approach to content management?
  • What are the challenges associated with going headless?
  • What are the different options for building a front-end for a headless environment?

 

Related articles:

Screen Shot 2019-06-18 at 10.57.07 AM.png

 

HOW DOES SALESFORCE PAGE DESIGNER WORK WITH A CMS?

 

Want to see how eCommerce business users can completely control the digital experience?

Then you need to come to the CoreMedia booth (#20) at Salesforce Connections in Chicago!  Red Hots, Wrigley Field and real Salesforce insights. That’s summer in Chicago with Salesforce Connections, taking place Jun 17-19, 2019 at McCormick Place West. I will be at the booth and would love to give you a personal demonstration of our CoreMedia Content Cloud.

VISIT US AT BOOTH #20 (CLICK HERE)

 

Let’s plan to connect so I can show you how to bring your product stories to life across a seamless customer journey with our Content Cloud platform.

Just to give you a taste for how great CoreMedia is with Salesforce, take a look at some material I posted on LinkedIn.

Screen Shot 2019-06-05 at 12.03.57 PM

If you don’t have or use LinkedIn, you can take a look at my Made Easy with CoreMedia video series on YouTube, where I have a collection of videos showing how easy digital experience can be with a first-class brand management system.

Webinar: Maximize your IBM Commerce Investment

Are you ready to hear how you can move forward with IBM Commerce? Do you have questions about your options for maximizing your investment? If so, join Brent Murray from CoreMedia and Rick Miller from Zilker in this 30-minute webinar and hear how your brand can be iconic with IBM Commerce, CoreMedia and Zilker!

Seize the Opportunity to Maximize your IBM Commerce Investment

Thur Jun 20, 2019 • 1-1:30 pm Eastern

Investing in IBM’s eCommerce platform was smart. But the recent divestiture of WebSphere Commerce to HCL likely raises some questions. What can you do to keep your WebSphere Commerce investment on track and moving forward?

Join CoreMedia and Zilker for this exclusive 30-minute webinar and we’ll walk you through everything you need to know to transform your customer experiences with advanced brand management and turn disruption into opportunity.

Register Today!

 

 

Microsoft Word to online article in a snap

Screen Shot 2019-05-30 at 12.52.42 PM.png

I don’t think the genie in Aladdin can do something this cool so quickly. You might be thinking, “well… I have been able to save Word documents to HTML since 2000”, and my answer would be “yes, but can you save it with streamlined HTML into a content management system and retain the rich text and images?”.

One of the frustrating things about the HTML Microsoft Word generates is that it’s filled with all kinds of nonsense tags and CSS, so it’s barely legible. Case in point: this is the HTML that Microsoft Word generates for the article referenced in the video:

<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
line-height:normal;background:#FCFCFC'><b style='mso-bidi-font-weight:normal'><span
style='font-size:10.5pt;font-family:"VarelaRegular",serif;mso-fareast-font-family:
"Times New Roman";mso-bidi-font-family:"Times New Roman";color:#3E3D3D;
mso-ansi-language:EN-US;mso-fareast-language:DE'>In Speed Flying and Speed
Riding, everything happens VERY fast.<o:p></o:p></span></b></p>

<p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;
line-height:normal;background:#FCFCFC'><span style='font-size:10.5pt;
font-family:"VarelaRegular",serif;mso-fareast-font-family:"Times New Roman";
mso-bidi-font-family:"Times New Roman";color:#3E3D3D;mso-ansi-language:EN-US;
mso-fareast-language:DE'>It makes sense to build a solid foundation of glider
control skills in a slower, more forgiving aircraft before you move to the big
time. Even a single day lesson will teach you valuable lessons about our
incredible aircraft and you will experience the magic of flight for the first
time. To get started just visit one of the speed riding schools in Chamonix and
go fast!</span></p>

Not pretty!

That same document imported into CoreMedia has this around the same block:

Screen Shot 2019-05-30 at 3.58.29 PM.png
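To make the idea of cleaning out the Microsoft HTML concrete, here is a small sketch using only the Python standard library. It is not how CoreMedia does its import; it simply keeps a short whitelist of tags (my own choice for this example), drops every attribute, and collapses the hard line wraps Word leaves inside paragraphs.

```python
import html
import re
from html.parser import HTMLParser

# Tags kept in the output; any other tag is dropped while its text is preserved.
KEEP_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li"}


class WordHtmlCleaner(HTMLParser):
    """Re-emit only whitelisted tags, stripped of all attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in KEEP_TAGS:
            self.parts.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in KEEP_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Re-escape text so characters like & and < stay valid in the output.
        self.parts.append(html.escape(data, quote=False))


def clean_word_html(markup):
    """Return the markup with Word-specific tags, classes, and styles removed."""
    cleaner = WordHtmlCleaner()
    cleaner.feed(markup)
    cleaner.close()
    # Collapse the line wraps and runs of spaces Word inserts into the text.
    return re.sub(r"\s+", " ", "".join(cleaner.parts)).strip()


# A shortened version of the first paragraph shown above.
sample = """<p class=MsoNormal style='mso-margin-top-alt:auto;
line-height:normal;background:#FCFCFC'><b style='mso-bidi-font-weight:normal'><span
style='font-size:10.5pt;font-family:"VarelaRegular",serif'>In Speed Flying and Speed
Riding, everything happens VERY fast.<o:p></o:p></span></b></p>"""

print(clean_word_html(sample))
```

A real importer also has to deal with images, lists, and tables, so treat this only as an illustration of the attribute and tag stripping step.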

The other challenge is simply getting the document into your CMS with all of the images properly uploaded and referenced in the article.

In this video, watch how easy it is to take an MS Word document and import it into CoreMedia Studio, preserving the rich text and images, and then use CoreMedia Studio to fix the various crop renderings of the teasers across the social channels.

 

Mobify + SFCC is like peanut butter and jelly, but where is the bread?

Salesforce Commerce Cloud, or Demandware, never really provided the best content management, nor a high-converting progressive web app (PWA) storefront that is lightning fast. With Mobify you get the latest and greatest PWA technology with their front-end-as-a-service business model. It is specifically designed for eCommerce platforms.

The Mobify Connector API will allow Salesforce ecommerce platform customers to quickly upgrade their mobile websites to high-converting Progressive Web Apps. – link

When I was a child, one of the best breads in the world was from my grandfather’s restaurant, Trinkaus Manor. They made their own bread every weekend, and most weekends Mom brought home an entire loaf. It was always a treat and is still often spoken about to this day. I loved peanut butter and jelly sandwiches, but my favorite peanut butter and jelly sandwiches were made with Trinkaus Manor bread!

But where is the bread that can hold together this Mobify + SFCC PBJ?

Then it hit me like a ton of bricks on the treadmill while watching the video below: CoreMedia is the bread! I wanted to learn about the Demo Jams they do over at Salesforce; they are very cool 3-minute demonstrations by different companies, which compete for the best demonstration given. Well, at about 22 minutes and 10 seconds I saw the Mobify demo, and they showed some iconic companies that use Mobify: Pandora, Payless, and Lancome.

Now, I have spoken about Pandora before and it’s pretty well known they are a CoreMedia customer. The bread in this case is CoreMedia Content Cloud! It looks like this super fast site is built on top of a headless Salesforce eCommerce platform, with Mobify as the front-end and CoreMedia Content Cloud for the digital experience!

I will be participating at the Demo Jam at Connections in late June so I look forward to it!

Can you do this with your headless PWA?

If the content for your progressive web or headless application can’t be controlled by non-IT employees, then you need to look at how you are doing business. My previous post about “Headless PLUS” showed why technologies like schema stitching are changing the way integrations work on the glass; this post focuses solely on the line-of-business user (marketer, merchandiser) and their content management expectations, whether the application is headless or not.

I stole this from a presentation we give to explain why CoreMedia is a game changer for shops with headless apps, because it’s headless PLUS:

Preview

The first thing marketers miss in a headless world is a functioning preview of the experience before it goes out to customers. How does the new experience feel? Marketers have to be able to answer that question with certainty at any time before staging it. To do that, they have to be able to preview the experience in all its variants for all different touch points, devices, customer segments and contexts.

Therefore, CoreMedia Content Cloud empowers marketers with a real-time preview for all touch points including single-page apps built with React, Angular or Vue.

Time Travel

But how will a customer experience feel on Christmas Eve for someone in Stockholm who speaks French with an iPhone XS Max? Marketers need to be able to answer that question too in order to orchestrate complex experiences at scale. We call it Time Travel.

Marketers can jump to any point in the future and simulate the experience for any touchpoint and context they want.

Quick Edit

Previewing an experience in various current or future contexts is a great tool for marketers. It gets even better with the ability to quickly edit the content objects creating the experience right from the app preview. CoreMedia Content Cloud enables developers to add context menus to the app preview that empower marketers to quickly edit all content on the fly.

Multi-Language 

Speaking the language of your customer shows respect. It can be a huge challenge for marketers though. Therefore, CoreMedia empowers marketers to orchestrate any experience in as many languages as needed. The whole translation process is managed within CoreMedia Content Cloud with seamless integrations into established translation services. Marketers can preview any experience in any language and use fall-back languages for any content objects that haven’t been translated yet.

Multi-Region

Not all content is relevant or even appropriate for all regions of the world. CoreMedia Content Cloud empowers marketers to orchestrate iconic global experiences while empowering local marketers to make the necessary changes without breaking the overall experience.

Micro-Experiences

If creating or changing even the smallest micro-experience already requires a developer, marketing teams quickly run into frustration. CoreMedia empowers marketers by giving them the creative tools to build and maintain micro-experiences on the fly without the need for developers or designers. Developers can empower marketers even more by adding more sophisticated capabilities to micro-experience templates used by the marketers.

Heads (Optional)

While modern frontend frameworks are powerful and quickly evolving, not all marketing needs are best met with a headless setup. More often than not, there is the need to support multiple heads at once. That’s why CoreMedia built powerful blueprints for various use cases to jumpstart development for customers, plus an integrated Studio to orchestrate all these heads seamlessly.

The result of all this tooling for marketers is empowering. A marketing executive of ours shared his experience like this: “It is very empowering for our teams. Our marketers can create and edit attractive content without the need to wait for designers or developers to do anything.”