"An extraordinary thinker and strategist" "Great knowledge and a wealth of experience" "Informative and entertaining as always" "Captivating!" "Very relevant information" "10 out of 7 actually!" "In my over 20 years in the Analytics and Information Management space I believe Alan is the best and most complete practitioner I have worked with" "Surprisingly entertaining..." "Extremely eloquent, knowledgeable and great at joining the topics and themes between presentations" "Informative, dynamic and engaging" "I'd work with Alan even if I didn't enjoy it so much." "The quintessential information and data management practitioner – passionate, evangelistic, experienced, intelligent, and knowledgeable" "The best knowledgeable, enthusiastic and committed problem solver I have ever worked with" "His passion and depth of knowledge in Information Management Strategy and Governance is infectious" "Feed him your most critical strategic challenges. They are his breakfast." "A rare gem - a pleasure to work with."

Saturday, 31 May 2014

Information Management Quote of the Week 31/05/14

Skeptical scrutiny is the means, in both science and religion, by which deep thoughts can be winnowed from deep nonsense.

Carl Sagan

See also: https://www.youtube.com/watch?v=dADUBcoEEHw 

Thursday, 29 May 2014

The one question that you must never ask!

Information Requirements gathering – without asking about information requirements

Over the years, I’ve tended to find that asking any individual or group the question “What data/information do you want?” gets one of two responses:

“I don’t know.” Or;

“What do you mean by that?”

End of discussion, meeting over, pack up go home, nobody is any the wiser. Result? IT makes up the requirements based on what they think the business should want, the business gets all huffy because IT doesn’t understand what they need, and general disappointment and resentment ensues.

Clearly for Information Management, Business Intelligence and Analytics solutions, this is not a good thing.

So I’ve stopped asking the question. Instead, when doing requirements gathering for an information project, I go through a workshop process that follows the following outline agenda:

Context setting: Why information management / Business Intelligence / Analytics / Data Governance* is generally perceived to be a “good thing”. This is essentially a very quick précis of the BI project mandate, and should aim at putting people at ease by answering the question “What exactly are we all doing here?”

(*Delete as appropriate).

Business Function & Process discovery: What do people do in their jobs – functions & tasks? If you can get them to explain why they do those things  - i.e. to what end purpose or outcome - so much the better (though this can be a stretch for many.)

Challenges: what problems or issues do they currently face in their endeavours? What prevents them from succeeding in their jobs? What would they do differently if they had the opportunity to do so?

Opportunities: What is currently good? What existing capabilities (systems, processes, resources) are in place that could be developed further or re-used/re-purposed to help achieve the desired outcomes?

Desired Actions: What should happen next?

As a consultant, I see it as part of my role to inject ideas into the workshop dialogue too, using a couple of question forms specifically designed to provoke a response:

“What would happen if…X”

“Have you thought about…Y”

“Why do you do/want…Z”.

Notice that as the workshop discussion proceeds, the participants will naturally start to explore aspects that relate to later parts of the agenda – this is entirely ok. The agenda is there to provide a framework for the discussion, not a constraint. We want people to open up and spill their guts, not clam up. (Although beware of the “rambler” who just won’t shut up but never gets to the point…)

Notice also that not once have we actively explored the “D” or “I” words. That’s because as you work through the agenda, any information requirements will either fall out of the discussion naturally as it proceeds, or else you can infer them from the other aspects of the discussion.

As the workshop attendees explore the different aspects of the session, you will find that the discussion touches upon a number of different themes, which you can categorise and capture on-the-fly. (I tend to do this on sheets of butcher’s paper tacked to the walls, so that the findings are shared and visible to all participants.) Comments will typically fall into the following broad categories:
  • Functions: Things that people do as part of doing business.
  • Stakeholders: people who are involved (including helpful people elsewhere in the organisation – follow up with them!)
  • Inhibitors: Things that currently prevent progress (these either become immediate scope-change items if they are show-stoppers for the current initiative, or else they form additional future project opportunities to raise with management)
  • Enablers: Resources to make use of (e.g. data sets that another team hold, which aren’t currently shared)
  • Constraints: “Non-negotiable” aspects that must be taken into account. (Note: I tend to find that all constraints are actually negotiable and can be overcome if there is enough desire, money and political will.)
  • Considerations: Things to be aware of that may have an influence somewhere along the line.
  • Source systems: places where data comes from
  • Information requirements: The outputs that people really want.
Here’s a (semi) fictitious example:

e.g. ADD: “What does your team do?”

Workshop Victim Participant #1: “Well, we’re trying to reconcile the customer account balances with the individual transactions.”

ADD: “And why do you want to do that?”

Workshop Victim Participant #2: “We think there’s a discrepancy in the warehouse stock balances, compared with what’s been shipped to customers. The sales guys keep their own database of customer contracts and orders and Jim’s already given us a dump of the data, while finance run the accounts receivables process. But Sally the Accounts Clerk doesn’t let the numbers out under any circumstances, so basically we’re screwed.”

  • Functions: Sales Processing, Contract Management, Order Fulfilment, Stock Management, Accounts Receivable.
  • Stakeholders: Warehouse team, Sales team (Jim), Finance team.
  • Inhibitors: Finance don’t collaborate.
  • Enablers: Jim is helpful.
  • Source Systems: Stock System, Customer Database, Order Management, Finance System.
  • Information Requirements: Orders (Quantity & Price by Customer, by Salesman, by Stock Item), Dispatches (Quantity & Price by Customer, by Salesman, by Warehouse Clerk, by Stock Item), Financial Transactions (Value by Customer, by Order Ref)
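If you prefer to capture the findings electronically rather than on butcher’s paper, the categorisation above could be sketched along the following lines. This is only an illustration: the `WorkshopLog` class and its method names are hypothetical, and the category names are simply the eight headings listed earlier.

```python
from collections import defaultdict

# The eight broad categories listed in the post.
CATEGORIES = (
    "Functions", "Stakeholders", "Inhibitors", "Enablers",
    "Constraints", "Considerations", "Source systems",
    "Information requirements",
)


class WorkshopLog:
    """Hypothetical helper: files each workshop comment under a category."""

    def __init__(self):
        self._notes = defaultdict(list)

    def capture(self, category, note):
        # Reject anything outside the agreed headings so the log stays tidy.
        if category not in CATEGORIES:
            raise ValueError(f"Unknown category: {category}")
        self._notes[category].append(note)

    def summary(self):
        # Return only the categories that actually received comments,
        # in the same order as the headings above.
        return {c: list(self._notes[c]) for c in CATEGORIES if c in self._notes}


# Entries taken from the (semi) fictitious example above.
log = WorkshopLog()
log.capture("Stakeholders", "Sales team (Jim)")
log.capture("Inhibitors", "Finance don't collaborate")
log.capture("Enablers", "Jim is helpful")
log.capture("Source systems", "Finance System")

for category, notes in log.summary().items():
    print(f"{category}: {', '.join(notes)}")
```

Keeping the category list fixed means that every comment has to be filed somewhere deliberate, which mirrors what happens when you write on a labelled sheet on the wall.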

You will also probably end up with the attendees identifying a number of immediate self-assigned actions arising from the discussion – good ideas that either haven’t occurred to them before or have sat on the “To-Do” list. (Note that all actions should have an agreed Due Date). That’s your workshop “value add” right there….

e.g. Workshop Victim Participant #1: “I could go and speak to the Financial Controller about getting access to the finance data. He’s more amenable to working together than Sally, who just does what she’s told.”
  • ACTION: WP#1: Discuss data access with Fin. Controller by end of week.

Happy information requirements gathering!


In response to a number of recent enquiries including this exchange on LinkedIn, I have prepared an example Data Specification and Information Requirements Framework, including a number of supporting templates. 

The links to download the Framework can be found on my Resources and Downloads page.

Saturday, 24 May 2014

Information Management Quote of the Week 23/04/14

“It is our attitude at the beginning of a difficult undertaking which, more than anything else, will determine its successful outcome.” 

(William James)

See also: https://www.youtube.com/watch?v=fLUBcOXI5pU

Thursday, 22 May 2014

Do you trust in the Dark Arts?

Why estimating Data Quality profiling isn't just guess-work

Data Management lore would have us believe that estimating the amount of work involved in Data Quality analysis is a bit of a “Dark Art,” and to get a close enough approximation for quoting purposes requires much scrying, haruspicy and wet-finger-waving, as well as plenty of general wailing and gnashing of teeth. (Those of you with a background in Project Management could probably argue that any type of work estimation is just as problematic, and that in any event work will expand to more than fill the time available…).

However, you may no longer need to call on the services of Severus Snape or Mystic Meg to get a workable estimate for data quality profiling. My colleague from QFire Software, Neil Currie, recently put me onto a post by David Loshin on SearchDataManagement.com, which proposes a more structured and rational approach to estimating data quality work effort.

At first glance, the overall methodology that David proposes is reasonable in terms of estimating effort for a pure profiling exercise - at least in principle. (It's analogous to similar "bottom-up" calculations that I've used in the past to estimate ETL development on a job-by-job basis, or creation of standard Business Intelligence reports on a report-by-report basis). 

I would observe that David’s approach is predicated on the (big and probably optimistic) assumption that we're only doing the profiling step. The follow-on stages of analysis, remediation and prevention are excluded – and in my experience, that's where the real work most often lies! There is also the assumption that a pre-existing checklist of assessment criteria exists – and developing the library of quality check criteria can be a significant exercise in its own right.

However, even accepting the "profiling only" principle, I’d also offer a couple of additional enhancements to the overall approach.

Firstly, even with profiling tools, the inspection and analysis process for any "wrong" elements can go a lot further than just a 10-minute-per-item-compare-with-the-checklist, particularly in data sets with a large number of records. Also, there's the question of root-cause diagnosis (And good DQ methods WILL go into inspecting the actual member records themselves). So for contra-indicated attributes, I'd suggest a slightly extended estimation model:

  • 10 mins: for each "Simple" item (standard format, no applied business rules, fewer than 100 member records) 
  • 30 mins: for each "Medium" complexity item (unusual formats, some embedded business logic, data sets up to 1000 member records) 
  • 60 mins: for any "Hard" high-complexity items (significant, complex business logic, data sets over 1000 member records) 
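As a rough illustration, the three tiers above can be turned into a simple calculation. This is a minimal sketch, not part of David's method: the function names are hypothetical, the tiering here looks only at record counts (in practice, unusual formats or embedded business logic may push an item into a higher tier), and the record counts in the example are made up.

```python
from collections import Counter

# Minutes of inspection per flagged attribute, from the three tiers above.
MINUTES_PER_ITEM = {"simple": 10, "medium": 30, "hard": 60}


def tier_by_record_count(member_records):
    """Assign a tier from record count alone (an assumption for this sketch)."""
    if member_records < 100:
        return "simple"
    if member_records <= 1000:
        return "medium"
    return "hard"


def inspection_minutes(tiers):
    """Total inspection time for a list of tier labels."""
    counts = Counter(tiers)
    return sum(MINUTES_PER_ITEM[tier] * n for tier, n in counts.items())


# Made-up record counts for four contra-indicated attributes.
flagged_record_counts = [40, 250, 5000, 80]
tiers = [tier_by_record_count(n) for n in flagged_record_counts]
total = inspection_minutes(tiers)

print(tiers)   # ['simple', 'medium', 'hard', 'simple']
print(total)   # 110 (minutes)
```
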
Secondly, and more importantly - David doesn't really allow for the human factor. It's always people that are bloody hard work! While it's all very well to do a profiling exercise in-and-of-itself, the results need to be shared with human beings - presented, scrutinised, questioned, validated, evaluated, verified, justified. (Then acted upon, hopefully!) And even allowing for the set-aside of the "Analysis" stages onwards, there will still need to be some form of socialisation within the "Profiling" phase. That's not a technical exercise - it's about communication, collaboration and co-operation. Which means it may take an awful lot longer than just doing the tool-based profiling process!

How much socialisation? That depends on the number of stakeholders, and their nature. As a rule-of-thumb, I'd suggest the following: 

  • Two hours of preparation per workshop (If the stakeholder group is "tame". Double it if there are participants who are negatively inclined). 
  • One hour face-time per workshop (Double it for "negatives") 
  • One hour post-workshop write-up time per workshop 
  • One workshop per 10 stakeholders. 
  • Two days to prepare any final papers and recommendations, and present to the Steering Group/Project Board. 
That's in addition to David's formula for estimating the pure data profiling tasks. 
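To make the socialisation rule-of-thumb concrete, here is a minimal sketch of the arithmetic. The function name and parameters are hypothetical, and it makes three assumptions that the rules above don't state: a working day is eight hours, a partial group of stakeholders still needs a full workshop (so the workshop count is rounded up), and "negative" applies to every workshop rather than just some of them.

```python
import math


def socialisation_hours(stakeholders, negative=False, hours_per_day=8.0):
    """Estimate socialisation effort in hours from the rule-of-thumb above."""
    # One workshop per 10 stakeholders, rounded up (assumption).
    workshops = math.ceil(stakeholders / 10)

    # Preparation and face-time double for negatively inclined groups;
    # the write-up hour does not.
    factor = 2 if negative else 1
    per_workshop = (2 * factor) + (1 * factor) + 1

    # Two days for final papers and the Steering Group presentation.
    final_papers = 2 * hours_per_day

    return workshops * per_workshop + final_papers


print(socialisation_hours(25))                 # 28.0
print(socialisation_hours(25, negative=True))  # 37.0
```

For 25 "tame" stakeholders that is three workshops at four hours each plus sixteen hours of final papers. With a hostile group, each workshop rises to seven hours.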

Detailed root-cause analysis (Validate), remediation (Protect) and ongoing evaluation (Monitor) stages are a whole other ball-game.

Alternatively, just stick with the crystal balls and goats - you might not even need to kill the goat anymore.

Thursday, 15 May 2014

Higher Education Business Intelligence Conference 2014 - Top Takeaways

It was a privilege to be the Chair for two intensive days of presentations, discussions and debate at the Liquid Learning Higher Education Business Intelligence Conference this week.

As the conference proceeded, a number of recurring themes began to emerge and while the focus was on the issues as they pertain to the Higher Education sector, I would suggest the areas that we highlighted are also just as relevant to other industry sectors.

Thought-provoking contributions came from University of Melbourne, University of Sydney, University of Wollongong, University of Newcastle, University of Western Sydney, Australian National University, RMIT and Macquarie University, with input from external consultants Pranay Lodhiya of Tadishi Group and Robert Eames of Fivenines Consulting, as well as my own keynote contribution to open the conference.

The range of topics and issues that we touched upon was diverse – institutional strategy, human and behavioural factors, organisation and structural challenges as well as technical, architectural and planning aspects (plus the impacts arising as a result of the Australian budget announcement which came overnight between conference sessions!).

1. Moving from basic reporting to forward forecasting and predictive modelling
There was clear enthusiasm and acceptance that there needs to be a move towards more predictive scenario models for such areas as student retention and churn, courses of study evaluation, staff management and financial impact planning.

However, this was tempered with the recognition that many universities are still getting to grips with the basic foundations of good quality standard reporting, trusted sources of information and establishment of KPIs and metrics.

2. The emerging role of the Business Intelligence Competency Centres (BICC)
BICCs are becoming more prevalent (even if that’s not what they’re sometimes being called!) aimed at focussing the organisation’s skills and expertise in Business Intelligence, Data Management and Analytics and creating a critical mass of capability. We recognised that the BICC needs to go beyond the technical delivery of BI solutions and must start to act as a change agent to develop and coach capability for business narratives, analytical thinking skills and decision-making into the wider organisation. The hierarchical position of the BICC can vary depending on the culture and organisational dynamics of the organisation, as long as it is positioned within an area that provides sufficient political sponsorship and support.

We noted that high-quality analytic thinkers and skilled BI specialists are in short supply and there is the ever-present threat that good people will be poached into other business sectors.

3. Business Strategy and Data Strategy in concert:
Business Intelligence solutions and services need to be expressed explicitly within the context of the business’s strategic objectives. The BI team needs to draw clear linkage and alignment between informational outputs and the strategic KPIs that they help to support.

Ideally, clear information management and data governance principles will be explicitly expressed within the business strategy.

4. Agile methods are great – when applied appropriately
IT departments are embracing “Big A” Agile methods and Kanban practices for continuous improvement with increasing zeal and these approaches are excellent for application development and process management initiatives.

Be careful, however. Agile methods are not suitable approaches for delivering Enterprise Data Models and Information Architectures, which require a more strategic, structured and holistic approach from the outset. IT colleagues need to be educated to understand the difference to ensure that the most appropriate methods are applied.

5. Higher Education needs canonical data models
A canonical data model (a.k.a. enterprise information model, reference data model etc.) provides an underlying “blueprint” for the information requirements of the enterprise. This drives consistency, repeatability, inter-operability and integrity of the organisation’s data and also supports consolidation of business rules and metadata.

Unfortunately, there are few industry-standard data models for the Higher Education sector, and the data collection requirements for government reporting are not detailed or comprehensive enough to support the strategic and operational data management demands of the business at the institutional level.

Having recognised the need for such a common core data model, some institutions are now working on developing their own canonical models to suit their own internal needs. There may be an opportunity to collaborate more widely on establishing a common standard information model, at least at the conceptual and logical data model levels, though it is unclear what the “compelling event” might be to make this happen.

6. Business Intelligence is a contact sport
You’ve got to engage on a continuing and ongoing basis. Get out and meet people, listen, understand what they do and then make relevant offers to help them do it more effectively. Build relationships – and then build them again. Don’t come empty-handed – have something interesting in your back pocket and ready to share as a point of stimulus.

7. Democratise the data
To be of value, data needs to be in the hands of business users – engage a pilot group of interested users as your evangelist community. Re-purposing existing data for new applications drives innovation and creativity.

This will sometimes mean (appropriate) use of “skunk” works to get things done, at least in the short term. Don’t be afraid to bypass the corporate data warehouse – but do it mindfully!

If new one-off and ad-hoc solutions become of repeatable value, then go back and apply the engineering rigour, resilience and ongoing support as a second pass.

8. Making progress is slow and can be frustrating. Take heart!
Effective change takes a long time in the Higher Education sector. Broad and in-depth consultation is expected for even the smallest change (everyone wants to have an opinion), high levels of unionisation inhibit responsive change, many people are task-oriented rather than outcomes-oriented and there is a general reluctance to make courageous or tough decisions.

(e.g. the issue of Academic Productivity is an “elephant in the room” but is one area where universities could make significant inroads into establishing long-term strategic budgetary health.)

That all said, change is possible. There are end-user champions who can engage with new initiatives to gain traction, and though it might take a bit longer, there is also significant scope for a “just do it” approach, at least with initial prototype solutions (with after-the-fact consultation, socialisation and engagement). We need to be proactive change agents bringing innovative ideas to the university community, and it’s part of the BI role to use data-driven evidence to act as facilitators, enablers and agents provocateurs. Keep challenging assumptions, keep asking the hard questions, keep pushing the boundaries.

And as Pranay put it: “If you’re not frustrated, then you’re not doing your job properly”!