# KRISHNA TEJA MALLADI

3655 N First St, San Jose, CA 95134 | https://www.linkedin.com/in/krishna-teja-malladi-190b4828 | ktej.iitk@gmail.com | 650-796-5810

#### STATEMENT

I am a computer systems scientist pursuing R&D in server memory and storage systems. With more than 8 years of industrial and academic experience, I work on all aspects of systems design: server architecture, the software/kernel ecosystem, workload analysis, memory and storage technologies, and PoC emulation.

#### EDUCATION

| Stanford University<br>Ph.D. in Electrical Engineering<br>Advisors: Prof. Mark Horowitz and Prof. Christos Kozyrakis<br>Dissertation: "Energy Proportional Memory Systems" | 2009-2013<br>GPA: 3.98/4.0 |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------|
| Indian Institute of Technology, Kanpur, India<br>M.Tech in Electrical Engineering<br>Dissertation: "Design of SoC for Network Based RFID Applications"<br>Academic Proficiency Medal                            | 2008-2009<br>GPA: 9.6/10.0 |
| Indian Institute of Technology, Kanpur, India<br>B.Tech in Electrical Engineering<br>Top of Graduating EE dual degree class                                                                                     | 2004-2008<br>GPA: 9.7/10.0 |

## PUBLICATIONS

## JOURNALS

[J4] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. "DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric," in *IEEE MICRO Top Picks*, Volume 37, Issue 3, pp. 70-78, 2017.

[J2] A. Boroumand, S. Ghose, B. Lucia, K. Hsieh, K. Malladi, H. Zheng, O. Mutlu. "LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory," in *IEEE Computer Architecture Letters*, Volume 16, Issue 1, pp. 46-50, 2016.

[J3] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. "DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric," in *ACM SIGARCH Computer Architecture News*, Volume 44, Issue 3, pp. 506-518, 2016.

[J1] K. Malladi, F. Nothaft, K. Periyathambi, B. Lee, C. Kozyrakis, M. Horowitz. "Towards Energy-Proportional Datacenter Memory with Mobile DRAM," in *ACM SIGARCH Computer Architecture News*, Volume 40, Issue 3, pp. 37-48, 2012.

## PEER-REVIEWED CONFERENCES

[C15] S. Li, D. Niu, K. Malladi, H. Zheng. "DRISA: A DRAM-based Reconfigurable In-Situ Accelerator" in 50th IEEE/ACM International Symposium on Microarchitecture (MICRO), Boston, October 2017.

[C14] Q. Xu, K. Malladi, M. Awasthi. "Rack Level Scheduling for Containerized Workloads" in 27th IEEE International Conference on Networking, Architecture and Storage (NAS), Shenzhen, China, August 2017.

[C13] K. Malladi, M. Chang, D. Niu, H. Zheng. "FlashStorageSim: Performance Modeling for SSD Architectures" in 27th IEEE International Conference on Networking, Architecture and Storage (NAS), Shenzhen, China, August 2017.

[C12] Q. Xu, M. Awasthi, K. Malladi, J. Bhimani, J. Yang, M. Annavaram. "Performance Analysis of Containerized Applications on Local and Remote Storage" in 34th International Conference on Massive Storage Systems and Technology (MSST), Santa Clara, May 2017.

[C11] Q. Xu, M. Awasthi, K. Malladi, J. Bhimani, J. Yang, M. Annavaram. "Docker Characterization on High Performance SSDs" in 18th IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Santa Rosa, April 2017.

[C10] M. Awasthi, K. Malladi. "KOVA: A Tool for Kernel Visualization and Analysis" in 35th IEEE International Performance Computing and Communications Conference (IPCCC), Las Vegas, December 2016.

[C9] K. Malladi, M. Awasthi, H. Zheng. "FlexDrive: A Framework to Explore NVMe Storage Solutions" in 18th IEEE International Conference on High Performance Computing and Communications (HPCC), Sydney, Australia, December 2016.

[C8] K. Malladi, M. Awasthi, H. Zheng. "DRAMPersist: Making DRAM Systems Persistent" in 2nd ACM International Symposium on Memory Systems (MEMSYS), Washington, D.C., October 2016.

[C7] K. Malladi, U. Kong, M. Awasthi, H. Zheng. "DRAMScale: Mechanisms to Increase DRAM Capacity" in 2nd ACM International Symposium on Memory Systems (MEMSYS), Washington, D.C., October 2016.

[C6] K. Malladi, M. Awasthi, H. Zheng. "Software-Defined Emulation Infrastructure for High Speed Storage" in 9th ACM International Systems and Storage Conference (SYSTOR), Haifa, Israel, June 2016.

[C5] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. "DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric," in 43rd ACM/IEEE International Symposium on Computer Architecture (ISCA), Seoul, Korea, June 2016.

[C4] K. Malladi, M. Chang, J. Ping, H. Zheng. "FAME: A Fast and Accurate Memory Emulator for New Memory System Architecture Exploration", in 23rd IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Atlanta, October 2015.

[C3] K. Malladi, I. Shaeffer, L. Gopalakrishnan, D. Lo, B.C. Lee, M. Horowitz. "Rethinking DRAM Powermodes for Energy Proportionality", in 45th IEEE/ACM International Symposium on Microarchitecture (MICRO), Vancouver, December 2012.

[C2] K. Malladi, F. Nothaft, K. Periyathambi, B. Lee, C. Kozyrakis, M. Horowitz. "Towards Energy-proportional Datacenter Memory with Mobile DRAM," in 39th ACM/IEEE International Symposium on Computer Architecture (ISCA), Portland, June 2012.

[C1] K. Malladi, D. V. Anderson. "Analog Implementation of SNR Based Gain Adaptation for Denoising", in 42nd IEEE International Symposium on Circuits and Systems (ISCAS), May 2009.

## CONFERENCE PREPRINTS AND POSTERS

[P2] Q. Xu, K. Malladi, M. Awasthi. "Rack Level Scheduling for Containerized Workloads" in 8th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), Mumbai, India, September 2017.

[P1] A. Boroumand, S. Ghose, B. Lucia, K. Hsieh, K. Malladi, H. Zheng, O. Mutlu. "LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory," in *arXiv preprint arXiv:1706.03162*, June 2017.

## GRANTED PATENTS

- [I10] US 9934154 B2 "Electronic system with memory management mechanism and method of operation thereof".
- [I9] US 9954533 B2 "DRAM-Based reconfigurable logic".
- [I8] US 9922696 B1 "Circuits and micro-architecture for a DRAM-Based processing unit".
- [I7] US 15/905348 B2 "Electronic system with memory management mechanism and method of operation thereof".
- [I6] US 9761296 B2 "Smart in-module refresh for DRAM".

- [I5] US 9966152 B2 "Dedupe DRAM system algorithm architecture".
- [I4] US 9577644 B2, CN 105703765 "A Reconfigurable logic architecture".
- [I3] US 9524769 B2 "Smart in-module refresh for DRAM".
- [I2] US 9727239 B2 "Electronic system with partitioning mechanism and method of operation thereof".
- [I1] US 9503095 B2, CN 105703765 "A Space-multiplexing DRAM-based reconfigurable logic".

\*27 other inventions pending at USPTO and worldwide

## INDUSTRY EXPERIENCE

| Samsung Semiconductor, Inc., San Jose, CA: Staff R&D Engineer                                                       | Aug'13-Current |
|---------------------------------------------------------------------------------------------------------------------|----------------|
| • Research server memory and storage architectures that improve DDR4 and HBM2 performance and capacity              |                |
| • Develop PoC systems on FPGA, with hardware and software implementation, to demonstrate server integration         |                |
| Rambus Inc, Sunnyvale, CA                                                                                           | 2011           |
| • High-performance DRAM serial link interfaces to suit power modes for datacenter applications                      |                |
| • Modifications to on-chip timing architecture for energy-proportional datacenter memory                            |                |
| Google Inc, Mountain View, CA                                                                                       | 2010           |
| • Quantify Google online datacenter applications with kernel tools and memory traces for Google websearch           |                |
| • Statistical framework that accurately detects small performance changes in websearch production clusters          |                |
| Qualcomm Inc, San Diego, CA                                                                                         | 2008           |
| • Timing enhancements to FFT core engine in MediaFLO UBM2 Modem for higher bandwidth; won the Roberto Padovani award |                |
| • Developed a real time debugger for the modem emulation platform                                                   |                |

## PEER REVIEWING FOR JOURNALS, CONFERENCES, GRANTS

Program Committee Member: ACM/IEEE NAS-2018, CF-2018, DAC-2018, NAS-2017, DAC-2017, CF-2017, ISVLSI-2017, ISPASS-2015, ISCA Memory-Forum-2014.

External Program Committee Member: ACM/IEEE ICS-2017, ISLPED-2016, HPCA-2014, MICRO-2014, MICRO-2013, ISCA-2011.

Journal Editorial Committee: IEEE Transactions on Nanotechnology-2018, Computer Architecture Letters-2015, Transactions on Computers-2013.

NSF Proposal, 2012, "Energy Efficient Memory Systems for Multi-core Chips".

NSF Proposal, 2010, "Heterogeneous Memory Systems for Energy Efficiency".

## AWARDS AND PRESS COVERAGE

IEEE MICRO-TOP PICKS-2017 for novelty and long-term impact in computer architecture

Samsung Memory Global R&D Excellence award, 2017.

Samsung Memory President's award for Best Project, 2014.

Robert McMillan. "Data centers eye second raid on your cellphone", Wired, March 2013.

Chris Edwards. "The wrong units of compute", Tech Design Forums, April 2013.

"Re-building the server - microprocessors to memory moves", Journal of the Institute of Engineering and Technology, September 2013.

Google Best-focused research appraisal, 2010.

Benchmark Capital Stanford Fellowship Award, 2009.

Academic Proficiency Medal, IIT Kanpur, 2009.

Qualcomm Roberto Padovani award for best R&D Innovation, 2008.

## SELECTED TALKS AND TEACHING

- AMD Inc, 2017
- Wave Computing, 2017
- IEEE NAS-27, Aug'17
- IEEE HPCC-18, Dec'16
- ACM MEMSYS-2, Washington, D.C., Oct'16
- ACM SYSTOR-9, Jun'16
- IEEE MASCOTS-23, Atlanta, Oct'15
- Keynote speech on "Building next generation server systems" at EDCS, HPCA-2014, Orlando, Feb'14
- Intel Labs R&D, Santa Clara, Apr'13
- Samsung R&D, San Jose, May'13
- Qualcomm Processor R&D, Raleigh, Jun'13
- Qualcomm SoC R&D, San Diego, Jun'13
- Center for Integrated Systems, Stanford, Nov'12
- IEEE/ACM MICRO-45, Vancouver, Canada, Dec'12
- Pervasive Parallelism Lab Retreat, Santa Cruz, Dec'12
- ACM/IEEE ISCA-39, Portland, Jun'12
- Pervasive Parallelism Lab Retreat, San Francisco, Jun'12
- Rambus Labs, Sunnyvale, Oct'11
- Computer Forum, Stanford University, Jun'11
- Google Inc, Mountain View, Nov and Mar'10
- IEEE ISCAS-42, May'09
- Qualcomm, San Diego, Jul'08
- Indo-German Winter Academy (VLSI), IIT Guwahati
- Teaching Assistant for Microelectronic Circuits for 75 students, IIT Kanpur
- Teaching Assistant for Analog Circuits Laboratory for 25 students, IIT Kanpur