• SpiNNaker: a multi-core System-on-Chip for massively-parallel neural net simulation

      Painkras, Eustace; Plana, Luis A.; Garside, Jim D.; Temple, Steve; Davidson, Simon; Pepper, Jeffrey; Clark, David; Patterson, Cameron; Furber, Steve B. (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2012)
      The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication. SpiNNaker is a massively-parallel computer system designed to model up to a billion spiking neurons in real time. The basic block of the machine is the SpiNNaker multicore System-on-Chip, a Globally Asynchronous Locally Synchronous (GALS) system with 18 ARM968 processor nodes residing in synchronous islands, surrounded by a light-weight, packet-switched asynchronous communications infrastructure. The MPSoC contains 100 million transistors in a 102 mm² die, provides a peak performance of 3.96 GIPS and has a power consumption of 1 W at 1.2 V when all processor cores operate at nominal frequency. SpiNNaker chips were delivered in May 2011, were fully operational, and met power and performance requirements.
• SpiNNaker: design and implementation of a GALS multicore system-on-chip

      Plana, Luis A.; Clark, David; Davidson, Simon; Furber, Steve B.; Garside, Jim D.; Painkras, Eustace; Pepper, Jeffrey; Temple, Steve; Bainbridge, John (ACM, 2011)
      The design and implementation of Globally Asynchronous Locally Synchronous (GALS) Systems-on-Chip is a challenging activity. The large size and complexity of the systems require the use of Computer-Aided Design (CAD) tools but, unfortunately, most tools do not work adequately with asynchronous circuits. This paper describes the successful design and implementation of SpiNNaker, a GALS multi-core system-on-chip. The process was completed using commercial CAD tools from synthesis to layout. A hierarchical methodology was devised to deal with the asynchronous sections of the system, encapsulating and validating timing assumptions at each level. The crossbar topology combined with a pipelined asynchronous fabric implementation allows the on-chip network to meet the stringent requirements of the system. The implementation methodology constrains the design in a way which allows the tools to complete their tasks successfully. A first test chip, with reduced resources and complexity, was taped out using the proposed methodology. Test chips were received in December 2009 and were fully functional. The methodology had to be modified to cope with the increased complexity of the SpiNNaker SoC. SpiNNaker chips were delivered in May 2011 and were also fully operational, and the interconnect requirements were met.