Main report
Strategic Data Science Coordination
The Strategic Data Science Coordination section of the questionnaire aimed to assess the establishment of (or plans for) strategic data science coordination within NSOs and more widely (for example, across their NSS).
1. Big data/data science projects established
Many NSOs provided qualitative information about the type of big data / data science projects that have been established at their organisation. Some of the more common projects involve alternative data sources, for example, web scraped data, mobile phone data and scanner data. More information on projects can be found in the following inventory: https://unstats.un.org/bigdata/inventory/.
Almost half of the respondents undertake big data or data science projects: 47% of NSOs currently undertake such projects, and 32% do not yet undertake them but are trying to establish them. Around 21% do not plan to undertake such projects at all.
2. Big data/data science strategy in place
28 NSOs (35% of respondents to the survey) indicate that they have a big data/data science strategy in place.
About a third of respondents therefore have a big data strategy in place, and a further 60% report that they are trying to establish one in their NSO. Only 5% of respondents neither have a strategy nor are trying to establish one.
3. Chief Data Officer/Data Science Lead available
20 NSOs (25% of respondents to the survey) indicate that they have a designated Chief Data Officer/Data Science Lead in place.
A quarter of the respondents therefore have a designated Chief Data Officer. About 42% are trying to establish this post, while about 30% do not plan to do so.
4. Coordination challenges for NSOs
The central challenge for NSOs is collaborating with Big Data source owners outside the government (65% of respondents), followed by human resources (58% of respondents) and legislative issues (54% of respondents).
Privacy issues related to public trust and methodological aspects are seen as medium-level challenges, with 44% of respondents indicating that privacy issues are difficult and 48% pointing to methodological aspects.
Cooperation with big data source owners inside the NSS and information technology issues are seen as low-level challenges, with 24% of NSOs indicating that cooperation with data owners inside the NSO is a low-level challenge, and 21% indicating that IT issues are a low-level challenge.
5. Big data partnerships established
Partnerships inside the NSS and with the government still dominate the field; however, there is high interest in partnerships with new data sources and providers.
Around 36% of NSOs have established partnerships with the NSS/government ministries. About a quarter of NSOs engage with academic institutes and with satellite or aerial image providers (ca. 26% each). Social media providers appear least important: fewer than 5% of NSOs have established partnerships with them, and about 58% of respondents are not considering doing so. Partnerships with cloud server providers seem similarly unpopular. Importantly, about 58% of NSOs are trying to establish a partnership with mobile phone operators.
6. Negotiation capacity of NSOs
Just over half of NSOs are able to negotiate data provision with their partners.
40 NSOs (51%) indicate that they are able to negotiate data provision with their partners. A further 35% of NSOs are trying to establish negotiation capacity, and only 8% do not plan to do so.
7. Data ethics policy
Almost two thirds of NSOs have a data ethics policy in place.
50 NSOs (63%) indicate that they have a data ethics policy in place, and 19% of all respondents are trying to establish one. Only 14% of NSOs do not plan to implement an ethics policy.
8. National quality assurance framework
Over a third of respondents have a national quality framework in place.
32 NSOs (41% of respondents to the survey) indicate that they have a national quality framework in place, and 44% are trying to establish one. Only 14% of respondents indicate that they do not plan to establish a quality framework.
Legal Framework
The Legal Framework section of the questionnaire aimed to assess the establishment of (or plans for) a legal framework for data access and data sharing within NSOs, their NSS, and potentially wider.
1. Legal framework with access to big data
44% of the responding NSOs have a legal framework that covers access to big data from other Government Departments and big data partnerships.
35 NSOs (44%) indicate that such a legal framework is in place, and 39% are trying to establish one. Only 10% of respondents are not considering a legal framework, and 6% do not know.
2. Legal framework for data access for academia
Over two thirds of NSOs have a legal framework in place that allows academia to safely access their data.
58 NSOs (73% of all respondents) indicate that they have a framework for data access for academia in place. A further 15% aim to establish a framework, and only 9% do not aim to establish one.
3. Legal framework that penalizes data disclosures
The large majority of respondents have a legal framework in place that penalizes data disclosures.
69 respondents (91%) indicate that they have a legal framework in place that penalizes data disclosures. 3% of respondents are trying to establish one, and only 1% do not plan to implement such a framework.
4. Data protection law
47 respondents (59%) have established an overarching data protection law, and 18% are trying to establish one. 11% of respondents are not considering the establishment of a data protection law.
Statistic | Yes | No, but trying to include this | No, not considered | Don’t know |
---|---|---|---|---|
Count | 33 | 2 | 30 | 11 |
Percent | 43% | 3% | 39% | 14% |
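The percentage row of the table above can be reproduced from the count row by dividing each count by the row total and rounding to the nearest whole percent. The following is a minimal Python sketch of that arithmetic; the counts are taken from the table, while the function and variable names are illustrative assumptions rather than anything used in the survey analysis.

```python
def to_percentages(counts):
    """Convert a mapping of response label -> count into whole-number percentages."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts copied from the table above.
counts = {
    "Yes": 33,
    "No, but trying to include this": 2,
    "No, not considered": 30,
    "Don't know": 11,
}

percentages = to_percentages(counts)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

Because each share is rounded separately, the rounded percentages need not sum to exactly 100% (here they sum to 99%).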
IT Infrastructure
The IT Infrastructure section of the report outlines the extent of, or future plans for, IT infrastructure within NSOs. This section also assesses how the IT of NSOs enables big data analytics in a secure environment. For many NSOs, IT infrastructure appears to present a considerable challenge for incorporating big data. The graph below depicts the responses to questions about onsite data storage capability, computing power and skills at the NSO. The results show that approximately:
42% of NSOs have the appropriate processes in place for secure import / export of the data;
52% have adequate (i.e. un-interrupted) power supply;
32% have the required skills for accessing the data.
Hence, the main challenges are a lack of onsite storage and computing power, plus a lack of the skills required for accessing the data.
Other data collected shows that around 24% of NSOs have access to, and are using, offsite national data centres. Access to a secure data centre was not available to 48% of the NSOs surveyed.
25% of NSOs reported having secure cloud infrastructure. Secure cloud infrastructure is not being considered by 33% of the NSOs surveyed.
Human Resources
The Human Resources section of the survey asked questions on the number of data science posts and practitioners within each NSO. It also asked questions on skills gaps and the future plans for recruitment and growth. This included the skills needed to develop and maintain appropriate methodologies.
1. Staff numbers
Unsurprisingly, the size of NSOs varied. To categorise them, the NSOs were grouped by staff numbers from small to very large. Those with fewer than 500 staff were categorised as small, those with between 500 and 2,499 as medium, those with between 2,500 and 5,000 as large, and those with more than 5,000 as very large. There were 75 valid responses that could be used to group NSOs.
Size | Count |
---|---|
Small | 26 |
Medium | 37 |
Large | 6 |
Very large | 5 |
Minimum | Q1 | Median | Mean | Q3 | Maximum | Missing (NA) |
---|---|---|---|---|---|---|
1 | 334.5 | 700 | 1776.27 | 1886 | 22969 | 26 |
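The size grouping described above is a simple threshold rule on total staff numbers. A minimal Python sketch of that rule follows; the thresholds come from the text, while the function name and the example staff numbers are hypothetical and used only for illustration.

```python
from collections import Counter


def size_category(staff: int) -> str:
    """Map a total staff number to the size group described in the text."""
    if staff < 500:
        return "Small"       # fewer than 500
    if staff < 2500:
        return "Medium"      # 500 to 2,499
    if staff <= 5000:
        return "Large"       # 2,500 to 5,000
    return "Very large"      # more than 5,000


# Hypothetical staff numbers, not actual survey responses.
example_staff = [120, 499, 500, 2499, 2500, 5000, 5001]
group_counts = Counter(size_category(n) for n in example_staff)
for group in ("Small", "Medium", "Large", "Very large"):
    print(group, group_counts[group])
```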
The survey also asked NSOs how many of their employees are applying Big Data / data science techniques. NSOs were asked:
The number of qualified “Data Scientists” at MSc or PhD level
The number of analysts who are applying Big Data / data science techniques
Others who are applying Big Data / data science techniques, such as IT professionals
Type | Minimum | Q1 | Median | Mean | Q3 | Maximum | Missing (NA) |
---|---|---|---|---|---|---|---|
Data Scientist | 0 | 0.00 | 2 | 10.39130 | 8 | 149 | 54 |
Analyst | 0 | 2.00 | 5 | 20.73913 | 20 | 250 | 54 |
Other | 0 | 0.25 | 3 | 16.02174 | 10 | 400 | 54 |
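The columns of the summary table above are standard descriptive statistics plus a count of missing responses. The sketch below shows one way to compute such a row in Python. It assumes that quartiles are obtained by linear interpolation between order statistics (the `inclusive` method of `statistics.quantiles`); the survey does not state which quartile definition was used. The input values are hypothetical, with `None` standing for a missing response.

```python
import statistics


def summarise(values):
    """Return min, quartiles, mean, max and the number of missing entries."""
    present = sorted(v for v in values if v is not None)
    # Assumption: quartiles by linear interpolation over the observed range.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "minimum": present[0],
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(present),
        "q3": q3,
        "maximum": present[-1],
        "na": len(values) - len(present),
    }


# Hypothetical responses, not actual survey data.
responses = [0, None, 0, 2, 8, 149, None]
summary = summarise(responses)
for name, value in summary.items():
    print(name, value)
```

A mean well above the median, as in the table above, indicates a right-skewed distribution: a few NSOs report far more staff applying these techniques than the typical NSO does.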
Before reading the following, please bear in mind the accuracy of the responses: the staff numbers given by NSOs may be estimates. It would not be unreasonable to hypothesise that the larger the NSO, the greater the number of staff applying data science techniques. This appears to hold for the number of data scientists qualified at MSc/PhD level and for the number of analysts. The evidence suggests a moderate positive relationship between the size of an NSO and its number of qualified data scientists, and a weak positive relationship between the size of an NSO and its number of analysts applying data science skills. There is no evidence that the number of staff in other roles applying these techniques is correlated with total staff numbers.
Type | Estimate (rho) | Test statistic | p-value | Method | Alternative |
---|---|---|---|---|---|
Data Scientist | 0.4953287 | 7161.286 | 0.0006296 | Spearman’s rank correlation rho | two.sided |
Analyst | 0.3082279 | 9816.247 | 0.0417956 | Spearman’s rank correlation rho | two.sided |
Other | 0.1692327 | 11788.588 | 0.2721214 | Spearman’s rank correlation rho | two.sided |
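Spearman's rank correlation, the method named in the table above, is the Pearson correlation of the ranks of the two variables, with tied values given their average rank. The Python sketch below illustrates the calculation on hypothetical data; the function names are illustrative assumptions. The helper `s_statistic` rests on a further assumption: that the test statistic in the table is S = (n^3 - n)(1 - rho) / 6. The reported values are consistent with that formula for n = 44 paired responses, but n is inferred from the numbers here and is not stated in the survey.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def s_statistic(rho, n):
    """Assumed form of the test statistic: S = (n^3 - n) * (1 - rho) / 6."""
    return (n ** 3 - n) * (1 - rho) / 6


# Hypothetical data, not actual survey responses.
total_staff = [120, 450, 700, 1900, 5200]
data_scientists = [0, 2, 1, 8, 40]
rho = spearman_rho(total_staff, data_scientists)
print(round(rho, 3))
```

Because the method works on ranks, a single very large NSO influences the estimate far less than it would influence an ordinary (Pearson) correlation of the raw staff numbers.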
2. External recruitment strategy
Only 33% of the NSOs surveyed reported having a strategy for recruiting external staff. A greater number (42%) are looking to establish a strategy. Reasons cited for difficulty in recruiting include having no coordinated strategy, or none that is specific to hiring the technical experts needed for Big Data / data science work. Respondents also feel that NSOs cannot offer sufficiently competitive benefits, which makes it difficult to recruit experts.
3. Internal upskilling strategy
A strategy to upskill current employees appears to be a higher priority for NSOs: 41% have such a strategy and over half (52%) are trying to establish one. These strategies include taught and self-taught training programmes ranging from introductory to Master's degree level, international collaborations, R and Python implementation, training roadmaps and curricula, and networking groups. Generally, NSOs appear to hold themselves responsible for upskilling staff in this area. Some also make an effort to partner with academia and other nations to provide the high-quality training that NSOs may struggle to provide internally.
4. Existing Big Data skills
Although many NSOs have already accessed training or are in the process of establishing it, the survey responses show that large gaps in big data skills remain.
Of the NSOs that responded to direct questions about big data skills, the most established skills are identified as Geographic Information Systems (51%) and coding skills (47%), whilst the least developed (or most needed) skills are Big Data Methodology (73%), Big Data Project Management (64%), and Mathematical / Statistical modelling (53%). Skills written in under the ‘Other’ response included topics such as data engineering and a need for improved domain knowledge.
One NSO raised a point worth considering: because new tools develop rapidly, there is an ongoing need for training and improvement even in skills that are deemed established. Staff turnover is another factor that contributes to the need for ongoing training.
5. Delivery of training
Big data / data science skills and access to training are of high interest to the Task Team, since it is also tasked with developing a Competency Framework against which training will be mapped. Big Data / data science training has been delivered by 47% of the NSOs surveyed, and 28 NSOs (35%) are trying to establish this training. The main focus of the training delivered appears to be the use of the programming languages R, Python and SQL. As well as looking to improve coding skills, NSOs reported delivering training on reproducibility and machine learning.
6. Academic partnerships
Partnerships with academia can prove useful in supporting NSOs with Big Data work. These partnerships were reported to help not only as a project resource but also by giving NSOs access to highly skilled technical experts and domain knowledge. Fewer than half of the NSOs (46%) have built such partnerships. Their value appears to be generally well recognised, with a further 38% trying to establish them. Partnerships with international organisations, other NSOs and international universities were also mentioned. Collaborations with organisations such as Eurostat were mentioned, and one NSO reported that its collaboration with an international university on a social media project was highly successful and resulted in national and international press coverage. This project was reported to have helped accelerate Big Data adoption in the NSO.
7. Competency framework
The need for a competency framework is emphasised by the results of the survey. Only 8% of NSOs surveyed stated that they had a Big Data / data science competency framework. Over half (52%) of NSOs are looking to establish one.
8. Career pathway
The number of NSOs with a data scientist career pathway was also reported to be low. Uptake was again 8%, but over half (54%) are not considering establishing one, compared with 30% that are trying to establish one. Career pathways may exist in these NSOs, but not specifically for data scientist positions.
9. Challenges to delivering training
NSOs face a number of challenges to delivering Big Data / data science training to their employees. The most frequently cited reason appears to be budget constraints limiting their ability to fund the necessary training.
Another problem is that some NSOs have difficulty accessing or hiring technical experts and highly skilled trainers who can upskill their workforce. The need for some to source this training externally can incur higher costs.
One NSO gives the example that it is difficult to hire staff who are highly skilled in both computer science and mathematics. This suggests that even highly skilled graduates in these fields may require training in Big Data and data science.
The breadth of skills regarded as necessary for working with Big Data / data science may make it difficult for NSOs to decide where to focus their training.
One NSO cited lack of access to the best practice of other countries as a challenge. Encouraging the sharing of training materials and best practice in training may therefore be an appropriate recommendation.
Guidance
The final section of the survey asked NSOs to indicate the level of urgency for guidance on big data topic areas. The graph below highlights the responses, with the highest urgency identified for ‘Skills and Training’ (72%). Although the previous section reported that training had been delivered by 47% of respondents, there are clearly still gaps in the provision of, and access to, training that require further investigation. The need for guidance is also high for ‘Access and partnerships’ (59%) and ‘Quality frameworks’ (53%).