Removing the Tester Safety Net


Moving to Continuous Delivery and a Quality Focused Process

We’re all familiar with the waterfall approach to software development.  It keeps skill-sets in silos and, from a tester’s point of view, we were the ones squeezed for time when projects overran.

Adopting agile in the latest Membership Programme incarnation at the Financial Times many years ago started to make a change.  Breaking work into smaller pieces and working much more closely as one team removed the big-bang nature of these problems, but ultimately they still existed.  Like most development teams, our testers were outnumbered by developers but had as much to do, if not more.  If anything, the introduction of automated testing made matters worse.  When you’re new to agile you can struggle to work out where to build automated tests into the process.  We agreed that they needed to be part of the sprint from day one, but this meant we still had split skill-sets: manual and automated testers.  Both were needed to get the work done.

Splunk HTTP Event Collector: Direct pipe to Splunk

In August 2016 the FT switched from on-premises Splunk to Splunk Cloud (SaaS). Since then we have seen big improvements in the service:

  1. Searches are faster than ever before
  2. Uptime is near 100%
  3. New features and security updates are deployed frequently

One interesting new feature of Splunk Cloud is the HTTP Event Collector (HEC). HEC is an API that enables applications to send data directly to Splunk without having to rely on intermediate forwarder nodes. Token-based authentication and SSL encryption ensure that communication between peers is secure.

HEC supports raw and JSON-formatted event payloads. With JSON-formatted payloads you can batch multiple events into a single JSON document, which makes data delivery more efficient because multiple events are delivered within a single HTTP request.
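To make the batching idea concrete, here is a minimal Python sketch of how a batched HEC request could be put together. It is an illustration rather than the code we run: the URL, the token and the helper names (build_hec_batch, build_hec_request) are made up for this example, and the request is only constructed, not sent. It assumes the standard HEC event endpoint, where a batch is a sequence of JSON event objects placed one after another in the request body and the token is passed in an "Authorization: Splunk <token>" header.

```python
import json
import urllib.request

# Placeholder values for illustration only; substitute your own HEC endpoint and token.
HEC_URL = "https://splunk.example.com/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


def build_hec_batch(events, sourcetype="_json"):
    """Serialise several events into one HEC request body.

    Each event becomes its own JSON object; the objects are simply
    concatenated, which is how HEC expects a batch to be laid out.
    """
    return "".join(
        json.dumps({"event": event, "sourcetype": sourcetype}) for event in events
    )


def build_hec_request(url, token, events):
    """Build (but do not send) a single POST request carrying all events."""
    body = build_hec_batch(events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",  # token-based authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Three log events travel in one HTTP request instead of three.
request = build_hec_request(
    HEC_URL,
    HEC_TOKEN,
    [
        {"level": "INFO", "msg": "publish started"},
        {"level": "INFO", "msg": "publish finished"},
        {"level": "WARN", "msg": "slow response"},
    ],
)
```

Passing the request to urllib.request.urlopen would deliver all three events in one round trip; error handling and retries are left out to keep the sketch short.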

Time before HEC

Before I dive into the technical details, let’s look at what motivated us to start looking at HEC.

I’m a member of the Integration Engineering team and I’m currently embedded in the Universal Publishing (UP) team. The problem that I was recently asked to investigate relates to log delivery to Splunk Cloud. Logs sent from UP clusters took several hours to appear in Splunk. This caused various issues with Splunk dashboards and alerts, and slowed down troubleshooting because the data wasn’t available in Splunk when we needed it.

The following screenshot highlights the issue: an event that was logged at 7:45am (see Real Timestamp) appears in Splunk 8 hours and 45 minutes later, at 4:30pm (see Splunk Timestamp).