Machine learning is a broad field encompassing many disciplines, but the fundamental idea is that a machine improves its performance on a task by learning from past experience. Machine learning is improving the performance of an explosion of applications, from driverless cars to systems that recommend the show you should watch tonight. The two types of machine learning are:
- Supervised Learning – The “answer” for the task is already known, and the machine is told that answer in order to refine and improve its performance.
  - Google utilizes supervised learning to refine the results of Google Images and Google Translate. User interaction with millions of pieces of data helps the system learn, making it more accurate with each exercise.
  - Protein classification and bioinformatics have been aided by support vector machines (SVMs), which can rapidly organize and classify with “good enough” performance.
  - Decision tree learning is useful when information is accumulated about an object or situation and “branches” occur as new facts are learned, until a classification (leaf) is reached (as in portfolio diversification for individual investors).
- Unsupervised Learning – The machine finds structure in a body of data on its own.
  - With the k-means clustering algorithm, for instance, a large body of data, with potentially many dimensions of input, is organized so that a number of exemplars can be used for later classification, as in systems that pick music a listener might like.
  - To enable voice authentication, for instance, a large number of people’s voices are used to create a universal background model (UBM), allowing a specific person’s variances to create an identifiable pattern for authentication.
  - Bayesian inference is used to update the probability of future outcomes as new information becomes available and is used by pharmaceutical companies to decide whether to enter Phase 3 clinical trials for potential new drugs.
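The Bayesian update above can be sketched in a few lines of Python. This is a minimal illustration of Bayes’ rule, not any company’s actual decision model; the prior and likelihood values are invented for the example.

```python
# Bayesian update: revise the probability that a drug candidate is effective
# after a positive trial result. All numbers are illustrative.

def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) via Bayes' rule."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    return numerator / evidence

# Prior belief that the drug is effective, before the new trial data.
prior = 0.30
# Probability of the observed positive result if the drug works / does not work.
posterior = bayes_update(prior, likelihood_if_true=0.80, likelihood_if_false=0.15)
print(round(posterior, 3))  # -> 0.696, the updated belief feeding the go/no-go decision
```

Each new piece of evidence simply feeds the posterior back in as the next prior, which is what makes the method attractive for staged decisions like clinical trials.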
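Decision tree learning, mentioned under supervised learning above, can be illustrated with a hand-built tree. The branching rules and portfolio classes below are hypothetical, purely to show how facts are tested one at a time until a leaf is reached.

```python
# A tiny hand-written decision tree: each branch tests a newly learned fact
# until a leaf (a portfolio classification) is reached.
# Thresholds and class names are hypothetical, for illustration only.

def classify_investor(age, horizon_years, risk_tolerant):
    if horizon_years < 5:              # branch: investment horizon
        return "conservative"          # leaf
    if risk_tolerant:                  # branch: risk tolerance
        return "aggressive" if age < 50 else "balanced"
    return "balanced"                  # leaf

print(classify_investor(age=35, horizon_years=20, risk_tolerant=True))   # aggressive
print(classify_investor(age=70, horizon_years=3,  risk_tolerant=True))   # conservative
```

A learned tree works the same way at prediction time; the learning step chooses which fact to branch on at each node.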
The KNUPATH Hermosa architecture excels in the following areas:
- Many Independent Processes – Each of the 256 cores in the Hermosa can run its own program. This is particularly useful for Genetic Algorithms, in which many “individuals” are simulated and the population evolves through different behaviors, with individuals exhibiting poor behavior culled in each generation and new individuals created.
- Sparse Matrices – The Hermosa architecture is much better suited to sparse matrix calculations than are graphics processing units (GPUs), because memory accesses are far faster on Hermosa than global memory accesses on a GPU. Because GPUs handle sparse matrices so poorly, researchers avoid them in their learning designs. For example, from the GoogLeNet paper: “…today’s computing infrastructures are very inefficient when it comes to numeric calculation on non-uniform sparse data structures. Even if the number of arithmetic operations is reduced by 100x, the overhead of lookups and cache misses is so dominant that switching to sparse matrices would not pay off.” Researchers know that sparse connections are the rule in the biological world, but they cannot use them efficiently because of the computing equipment available to them.
- Matrix times Vector – Many machine learning and signal processing calculations reduce to matrix-times-vector calculations. Hermosa implements these very efficiently, whereas GPUs do well only with dense matrix times dense matrix. The reason is latency: GPUs pay a heavy penalty accessing global memory, whereas Hermosa can access its memory in 10 to 20 clock cycles (10–20 ns).
- Data Flow Computing – Hermosa excels at data flow calculations, in which disparate functions are used in machine learning algorithms, because each core in Hermosa can communicate with any other core with little latency. This mirrors the way biological brains compute. In general, Hermosa is likely to perform well on problems normally handled by FPGAs.
- High Bandwidth, Low Computation – Because data communication is so efficient on Hermosa, it performs well on problems that stream through large amounts of data relative to the amount of computation performed on each core. An example is k-means clustering, which also exercises other strengths of Hermosa, including data sharing, synchronization between epochs of the computation, and large numbers of branching instructions.
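The Genetic Algorithms bullet above can be made concrete with a small sketch. This serial Python version stands in for what would be per-core “individuals” on Hermosa; the population size, selection rule, and fitness function are illustrative choices, not anything specific to the architecture.

```python
import random

# Minimal genetic algorithm sketch: evolve bit-strings toward all ones.
# On a many-core machine, each individual's evaluation could run on its
# own core; here everything runs serially. Parameters are illustrative.

def fitness(bits):
    return sum(bits)  # count of 1s; the maximum is the string length

def evolve(pop_size=20, length=16, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # cull the weaker half
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]             # one-point crossover
            i = rng.randrange(length)
            child[i] ^= 1                         # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # fitness of the best surviving individual
```

Because survivors are carried over unchanged, the best fitness never decreases from one generation to the next; selection and crossover do the improving.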
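To make the sparse matrix-times-vector discussion concrete, here is a minimal compressed sparse row (CSR) multiply in plain Python. The point is the access pattern: only nonzero entries are stored and visited, and every product requires an indexed lookup into the vector — exactly the irregular memory traffic that is cheap in fast local memory and expensive in GPU global memory.

```python
# Sparse matrix times vector in CSR (compressed sparse row) form:
# data holds the nonzero values, indices their column positions, and
# indptr marks where each row's entries begin and end.

def csr_matvec(data, indices, indptr, x):
    """Return y = A @ x for A stored as CSR arrays."""
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]   # indexed lookup into x
        y.append(acc)
    return y

# A = [[2, 0, 0],
#      [0, 0, 3],
#      [1, 4, 0]]
data    = [2.0, 3.0, 1.0, 4.0]
indices = [0,   2,   0,   1  ]
indptr  = [0, 1, 2, 4]
print(csr_matvec(data, indices, indptr, [1.0, 1.0, 1.0]))  # [2.0, 3.0, 5.0]
```

A dense multiply would touch all nine entries of A; the CSR form touches only the four nonzeros, which is where the 100x reduction in arithmetic cited above comes from.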
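The k-means workload pattern described above — stream every point, branch to the nearest centroid, then synchronize between epochs to recompute centroids — can be sketched in one dimension. This is a generic textbook version, assuming nothing about Hermosa’s programming model.

```python
import random

# One-dimensional k-means sketch: the assignment step streams through the
# data with little computation per point (branch-heavy distance tests),
# and each epoch ends with a synchronization to recompute centroids.

def kmeans_1d(points, k=2, epochs=10, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(epochs):
        clusters = [[] for _ in range(k)]
        for p in points:                     # stream: one pass over the data
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]   # sync step
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
print(kmeans_1d(points, k=2))  # two exemplars, one near each cluster
```

The surviving centroids are the “exemplars” mentioned in the unsupervised learning section: new data can later be classified by distance to them alone.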