“Data scientists are expensive resources that often fail to produce results because they don’t get the technical support they need,” says Brett StClair, CEO of Teraflow, a data engineering firm with offices in London, Johannesburg and Cape Town.
According to StClair, data science is a multidisciplinary function that requires the input of various experts, not just the data scientist.

Starting from scratch

Newly hired data scientists regularly find themselves entering a workplace with poor machine learning infrastructure, few resources to work with, and no appropriate management function to guide them. They may be embedded in the business intelligence (BI) or analytics team with only a token amount of data to experiment with. It’s not long before they realise this sample data, and the ML tooling they have personally selected, is hardly enough to solve even basic business problems.

“To get the most from its investment, the management team has to change its approach to data science or it will continue to come up empty,” warns StClair.

An engineering approach

Organisations should think of the machine learning function as being similar to building a factory. Industrial engineers must first lay out production lines in the most efficient and cost-effective configuration possible before the factory manager can start manufacturing. They also need a well-stocked warehouse of raw materials with the right components for making finished goods.

In the same manner, data engineers and machine learning engineers need to build the most efficient data pipelines and machine learning processes before data scientists can effectively analyse data and train algorithms. They also have to stock these pipelines with clean, corrected data drawn from a supply chain of data silos spread across the enterprise and the internet.

“New data scientists are basically being given an empty raw materials warehouse and a deserted factory floor,” says StClair.
“Yet, they’re expected to create AI-based components from nothing.”

The right people for the job

The biggest mistake a company can make is to assign responsibility for developing its machine learning infrastructure to its IT department, its BI team or the data scientists themselves. It may be hard to believe that those who deal with data all day are the least qualified to construct a sound platform for transforming it into production-ready AI. Yet companies with winning AI implementations are those that recognise a highly specialised skillset is required to get this critical element right.

“Factory managers don’t build factories; they produce finished goods,” says StClair. “They leave the factory building and raw material supply to the experts.”

Success begins at the top

Above all, organisations need a strong executive leader, a Chief Data Officer or Chief Data Science Officer, to architect their data science strategy and drive its growth. “When it comes to board-level buy-in, most enterprises already have that,” says StClair. “What they often lack is a knowledgeable C-suite champion who knows what they absolutely must get right to reach their goals and overcome the inevitable obstacles along the way.”

ENDS

MEDIA CONTACT: Stephné du Toit, 084 587 9933, [email protected], www.atthatpoint.co.za

For more information on Teraflow.ai please visit:
Website: https://teraflow.ai/
LinkedIn: https://www.linkedin.com/company/teraflow/
Facebook: https://www.facebook.com/teraflowai-107896893957011/
Finance Minister Tito Mboweni is expected to deliver his 2020 Budget Speech on 26th February. No doubt, many South Africans will be interested in what the projected economic figures for the year will be. But could these be predicted using machine learning? “As an experiment, this is exactly what we attempted to do,” reports Kimoon Kim, Data Director at Teraflow, a data engineering firm with offices in Johannesburg, Cape Town and London.
The experiment

With ten years of data, including the GDP Growth Rate, Inflation Rate, Interest Rate, Unemployment Rate and the USD/ZAR Exchange Rate, Kim attempted to find relationships that would allow him to project this year’s budget figures. It turned out there were no correlations between them strong enough to make an accurate prediction.

However, by scanning a wide range of global data sets, Kim was able to identify similar trends from completely unrelated sources. He found patterns very much like the GDP Growth Rate in US statistics on the number of people killed by hot drinks, food, fats and cooking oils. Similarly, he discovered a close resemblance between the number of people killed by hot water in the US and the SA Inflation Rate. Although these trends are completely unrelated in a causal sense, their highly correlated changes in value over time mean one may be used to gauge the other.

Kim’s prediction: a GDP Growth Rate of 0.6% and an Inflation Rate of 3.6% for 2020. “Of course, we’ll follow the Budget Speech intently to see just how close we are,” he says.

The importance of correlation

Apart from forecasting the country’s economic outlook, identifying important correlations, even between seemingly unrelated data sets, can have a huge impact on business decisions. For instance, one financing company in China gave its in-house AI system the task of assessing loan applications made through its mobile app. Whereas banks often employ around 10 measurements to rate an applicant’s creditworthiness, the AI appraises some 5,000 personal attributes based on available data. For example, it considers how confidently a person types when applying or, more surprisingly, whether their phone battery is low during the process. It seems the AI identified a hidden correlation with a sense of responsibility, detecting that people with poor repayment habits often did not ensure their battery was properly charged for this important financial transaction.
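Proxy relationships of this kind are typically surfaced with a simple correlation measure. The sketch below applies the Pearson correlation coefficient to two short, entirely hypothetical series (the figures are illustrative, not Kim’s actual data):

```python
import numpy as np

# Hypothetical annual series (illustrative values only, not the real data sets)
gdp_growth = np.array([3.0, 2.5, 1.8, 1.2, 0.9, 0.7])   # SA GDP growth, %
hot_drink_deaths = np.array([61, 55, 44, 32, 27, 24])    # unrelated US series

# Pearson correlation: values near +1 mean the series move in lockstep
r = np.corrcoef(gdp_growth, hot_drink_deaths)[0, 1]
print(f"correlation: {r:.3f}")

# A strong correlation lets one series gauge the other via a linear fit,
# even though there is no causal link between them.
slope, intercept = np.polyfit(hot_drink_deaths, gdp_growth, 1)
estimate = slope * 22 + intercept  # project GDP growth from a new reading
print(f"estimated GDP growth: {estimate:.2f}%")
```

A fit like this only gauges, never explains: the moment the unrelated series diverges, the proxy silently breaks, which is why such correlations are monitored rather than trusted.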
It’s something few human loan officers would, or even could, contemplate when evaluating someone’s credit risk. This illustrates the power of leveraging correlations correctly.

Operationalising data for correlation

To identify significant correlations, organisations must operationalise their data: extracting it from siloed enterprise data sources, transforming it into an appropriate format for analysis, and collecting it in a centralised data repository where it can be easily accessed. “Only when data is brought together through a repeatable, systematic process can these relationships be effectively exposed and strategically exploited,” says Kim.

Whether predicting the country’s economic trends, understanding behavioural patterns in consumers, or making previously unimagined business decisions, the critical role of correlation cannot be overstated.

ENDS

Last month, Toshiba Corporation in Japan announced it had created an algorithm that can crunch data faster than the world’s fastest supercomputer, apparently using everyday desktop PC technologies.
“This news highlights the pivotal role algorithms play in machine learning and future business competitiveness,” says Brett StClair, CEO of Teraflow, a data engineering firm with offices in London, Johannesburg and Cape Town.

What the algorithm does

Toshiba’s Simulated Bifurcation Algorithm is a breakthrough in combinatorial optimisation. This is a class of problems in which each new element added to a set increases the number of possible combinations exponentially, until calculating the best combination becomes impossible on traditional computers. Real-world examples include selecting optimal logistics routes, reducing traffic congestion, choosing lowest-cost/highest-return financial investments, and designing molecules for drug development. It is to this type of problem that quantum computers will one day be applied, once they are sufficiently advanced.

“The ability to solve these problems now using classical computers shows how the right algorithm can make all the difference,” says StClair.

Why it’s important

Many organisations still handle data science and machine learning in an inefficient, repetitive manner, instead of developing an automated, integrated approach to data preparation, the machine learning process, and the deployment of AI models into live business operations.

“When the data science team focuses strictly on the creation (or effective selection) and application of the right algorithms, it can solve business problems faster and more efficiently,” says StClair. “Every other consideration, like data cleaning, should be subordinated to that goal.” He refers to this as the “operationalisation of data and productionisation of AI”.

StClair likens the approach to DevOps, where software developers focus on writing code and solving business problems, while their code is compiled, tested, integrated and deployed by a background system that is typically both centralised and automated.
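The combinatorial explosion behind Toshiba’s result can be made concrete with a toy brute-force version of the investment-selection example. All figures here are hypothetical; the point is that a naive search must examine every subset, and the number of subsets doubles with each asset added:

```python
from itertools import combinations

# Hypothetical assets: name -> (cost, expected return); figures illustrative.
assets = {"A": (4, 7), "B": (3, 6), "C": (5, 9), "D": (2, 3)}
budget = 9

best_value, best_set, examined = 0, (), 0
for r in range(len(assets) + 1):
    for combo in combinations(assets, r):   # every possible subset of assets
        examined += 1
        cost = sum(assets[a][0] for a in combo)
        value = sum(assets[a][1] for a in combo)
        if cost <= budget and value > best_value:
            best_value, best_set = value, combo

print(f"examined {examined} subsets; best {best_set} with return {best_value}")
```

Four assets already mean 16 subsets; at 50 assets the count exceeds 10^15, which is why specialised optimisation algorithms, classical or quantum, matter.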
“This is how organisations can reap the greatest benefits and maximise the return on their machine learning investment,” he says.

Algorithms in the Cloud

Even though the Toshiba algorithm can run on desktop machines, accelerated computation is achieved by parallelising across computers. “Our algorithm thus highly resonates with the technology trends in parallel computing,” states the company’s website.

This highlights another implication for data science. As their data grows, organisations find it increasingly difficult to develop automated decision-making within the computing and storage constraints of their corporate systems. “Many are turning to the Cloud, not only for unlimited compute and storage but also unlimited parallelised processing,” says StClair. “There is no way to become an AI-driven organisation without it.”

Calculated competitiveness

The competitiveness of future organisations will hinge not only on how well they employ AI within their business but on the agility with which they can evolve their algorithm portfolio to meet new business challenges. “Companies cannot keep rolling out models manually on a corporate system without eventually running out of resources,” warns StClair. “Toshiba’s algorithm shows us what’s possible and that investment in a Cloud-based, systematised approach is worth it.”

ENDS

SARS recently announced it had embarked on a journey of digital transformation that will see its operations “informed by data-driven insights, self-learning computers, artificial intelligence and interconnectivity of people and devices”.
“When a national institution like SARS voices a commitment to digital transformation, it’s a clear indicator of the profound effect the Fourth Industrial Revolution is having on trade, industry and government,” says Michael Cowen, Transformation Director at Teraflow, a data engineering firm with offices in Johannesburg, Cape Town and London.

Yet industry analysts such as McKinsey report that 70% of digital transformation projects fail. So what should the country’s tax authority, and every other organisation, do to ensure its own efforts succeed? According to Cowen, successful digital transformation projects get several key steps right from the start.

They operationalise data

In its 2018/2019 Annual Report (page 60), SARS states that it has already cleaned its data and implemented generic analytics, and has now entered the next stage of advanced and predictive analytics. While clean data is critical, organisations must also ensure their data is fully operationalised. Operationalised data is not only clean but continually and automatically extracted from enterprise-wide data sources, then formatted and stored in a centralised Cloud-based repository where it is readily available for data analytics and machine learning.

“By building a continuous, independent process, organisations ensure the data for their analytics and AI systems is always fresh and reflects the latest trends in their business, without manual intervention,” says Cowen. “This will accelerate their innovation programmes.”

They digitise and automate workflows

To add value, data needs to flow efficiently through an enterprise or across its supply chain, creating a seamless link from customer requests through to final delivery and strategic review. Workflows define how transactions are passed from one business activity to another, whether manual or computerised. Modelling this digitised representation of the way work is performed is at the heart of digital transformation.
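The extract-transform-store cycle behind operationalised data can be sketched in a few lines. The example below is a minimal, hypothetical illustration using only the standard library: two “silos” holding the same customers in different shapes are normalised into one canonical schema and loaded into a central store:

```python
import csv, io, sqlite3

# Two hypothetical silos holding the same customers in different shapes.
crm_csv = "id,name,signup\n1,Ann,2019-03-01\n2,bob ,2019/07/15\n"
billing_rows = [{"customer_id": 2, "name": "Bob", "joined": "2019-07-15"},
                {"customer_id": 3, "name": "cy", "joined": "2020/01/09"}]

def normalise(cid, name, date):
    # Transform step: one canonical schema, trimmed names, ISO-style dates.
    return (int(cid), name.strip().title(), date.replace("/", "-"))

# Extract from each silo into a single de-duplicated set of records.
records = {}
for row in csv.DictReader(io.StringIO(crm_csv)):
    rec = normalise(row["id"], row["name"], row["signup"])
    records[rec[0]] = rec
for row in billing_rows:
    rec = normalise(row["customer_id"], row["name"], row["joined"])
    records.setdefault(rec[0], rec)  # CRM remains the system of record

# Load into a central store where analytics and ML can reach it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, signup TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?, ?)", records.values())
rows = db.execute("SELECT id, name, signup FROM customers ORDER BY id").fetchall()
print(rows)
```

In practice this logic runs continuously and unattended, which is what keeps the central repository fresh without manual intervention.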
“Failing projects attempt to replicate human input, keystroke by keystroke,” reports Cowen. “However, digital organisations minimise manual processing and pursue highly automated workflows that exploit integration between systems, and standardise problem analysis and resolution.”

They embrace a digital mindset

Finally, successful digital transformation requires that executive stakeholders understand what data is and how it can be used to propel innovation initiatives and make systems intelligent. This can be very difficult for those accustomed to a traditional style of business management, and may require intensive change management to correct. With a solid grasp of digital concepts such as Cloud computing, mobile apps and IoT, they can reimagine the way their products and services are delivered to an online market that already embraces disruptive technologies.

“A digital mindset will help them envision services that extend past the physical business, reaching customers, suppliers and employees, and making essential business processes available to them on their devices,” says Cowen.

Whether it is SARS or any other organisation that commits to digital transformation, these key steps will help it achieve better results.

ENDS

MEDIA CONTACT: Idéle, 082 573 9219, [email protected], www.atthatpoint.co.za

To get real value from data science, organisations must operationalise their data and productionise their AI models. This is according to Prinavan Pillay, Director at Teraflow.ai, a data engineering firm that specialises in these services, with offices in Johannesburg, Cape Town and London.
“Once these two key problems are solved, enterprises can expect to accelerate their data science projects to the point where they are seeing major innovations in their operations,” he says. But what exactly do the terms mean?

Data science as a business function

“Think of data science as a factory,” suggests Pillay. “Raw material, data, goes in, flows through a repeatable production process, and AI models come out, ready to add intelligent automation to a company’s operations.”

Although each model accomplishes a different task, the way models are created can be standardised. However, like software, models must be constantly updated to achieve greater accuracy, correct flaws, and adapt to changes in data over time. How efficient this process is determines the speed at which a company can innovate.

Operationalising data

A major problem with data science is that its raw material, data, does not come neatly packaged and ready for production. Instead, it tends to be spread across enterprise databases, document folders, spreadsheets, production systems, email servers, social media and more. It is also siloed within specific business functions, limiting its innovation potential. Before data can be used, it must be collected, cleaned, transformed into a standard format, and stored in a central data store that is readily accessible to the data science team. Rather than refining this process as they go, organisations should design and implement it in advance, freeing data scientists to focus on selecting algorithms and training models. “A repeatable, highly efficient system of processing data before it is needed is the essence of operationalising data,” says Pillay.

Productionising AI models

As an AI system is developed, a model is trained: a representation of the statistical patterns in the data that can later be used to recognise similar data. A large number of applications can produce a trained AI model, and each comes with its own output format.
In addition, the various programming languages used throughout a company may not be able to read these formats, restricting their use in corporate systems. It turns out that getting a model into production is quite tricky. This has driven the rise of model exchange environments, which allow trained models to be exported and deployed to highly scalable systems. Developers can then import and call these models in whatever format suits their needs, without needing machine learning expertise.

“As with gathering data, the method of turning models into useful enterprise objects must be standardised and repeatable, and this is what is meant by productionising AI models,” says Pillay.

The right approach

Manufacturers wouldn’t start producing goods until they had acquired the correct raw materials, laid out their factory in a practical manner and developed a system for delivering their finished products. The same should be true of data science. “Organisations that address the twin concerns of operationalising data and productionising AI will gain a much greater return on their data science investment, and unleash innovation across their enterprise,” concludes Pillay.

ENDS
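The exchange idea can be illustrated with a purely hypothetical toy format (not a real standard): a trained model’s parameters are serialised to a language-neutral payload that any runtime, in any language, can load and score against:

```python
import json

# Hypothetical trained model: a linear scorer reduced to weights and a bias.
model = {"format": "toy-exchange-v1",
         "weights": [0.4, -1.2, 0.7],
         "bias": 0.05}

# "Export": serialise the model to a language-neutral payload.
payload = json.dumps(model)

# "Import" in any consuming system (Java, Go, JS...) and score a record,
# with no machine learning knowledge required of the developer.
loaded = json.loads(payload)
features = [1.0, 0.5, 2.0]
score = sum(w * x for w, x in zip(loaded["weights"], features)) + loaded["bias"]
print(f"score: {score}")
```

Real exchange formats such as ONNX and PMML play this role in practice, carrying full model graphs and metadata rather than a toy weight list, but the principle of a standardised, repeatable export/import path is the same.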