Krishna T. Malladi

3655 N First St · San Jose, CA 95134 · (408) 544-4000 · k.tej@samsung.com

I am a computer systems scientist pursuing R&D in server memory and storage systems. With more than 8 years of industrial and academic experience, I work on all aspects of systems design, including server architecture, the software/kernel ecosystem, workload analysis, memory and storage technologies, and PoC emulation.

Experience

Staff R&D Engineer

Senior R&D Engineer

Samsung Semiconductor, Inc, San Jose, CA

• Research server memory and storage architectures that improve DDR4 and HBM2 performance and capacity
• Develop PoC systems on FPGA, with hardware and software implementations, to demonstrate server integration

2016 - Present
2013 - 2016

Engineering Intern

Rambus Inc, Sunnyvale, CA

• High-performance DRAM serial link interfaces to suit power modes for datacenter applications
• Modifications to on-chip timing architecture for energy-proportional datacenter memory

2011

Engineering Intern

Google Inc, Mountain View, CA

• Quantify Google online datacenter applications using kernel tools and memory traces for Google websearch
• Statistical framework that accurately detects small performance changes in websearch production clusters

2010

Engineering Intern

Qualcomm Inc, San Diego, CA

• Timing enhancements to FFT core engine in MediaFLO UBM2 Modem for higher bandwidth
Won the Roberto Padovani award
• Developed a real time debugger for the modem emulation platform

2008

Education

Stanford University

Ph.D., Dept. of Electrical Engineering
Dissertation: Energy Proportional Memory Systems
Advisors: Prof. Mark Horowitz and Prof. Christos Kozyrakis
Benchmark Capital Stanford Fellow
GPA: 3.98
2009 - 2013

Indian Institute of Technology, Kanpur

M.Tech. (Dual-Degree), Dept. of Electrical Engineering

Dissertation: “Design of SoC for Network Based RFID Applications”
Academic Proficiency Medal
GPA: 9.6

2008 - 2009

Indian Institute of Technology, Kanpur

B.Tech., Dept. of Electrical Engineering
Top of Graduating EE dual degree class

GPA: 9.7

2004 - 2008

Skills

Programming Languages & Tools

Publications

JOURNALS

[J4] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. “DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric,” in IEEE MICRO Top Picks, Volume 37, Issue 3, pp. 70-78, 2017.

[J3] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. “DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric,” in ACM SIGARCH Computer Architecture News, Volume 44, Issue 3, pp. 506-518, 2016.

[J2] A. Boroumand, S. Ghose, B. Lucia, K. Hsieh, K. Malladi, H. Zheng, O. Mutlu. “LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory,” in IEEE Computer Architecture Letters, Volume 16, Issue 1, pp. 46-50, 2016.

[J1] K. Malladi, F. Nothaft, K. Periyathambi, B. Lee, C. Kozyrakis, M. Horowitz. “Towards Energy-Proportional Datacenter Memory with Mobile DRAM,” in ACM SIGARCH Computer Architecture News, Volume 40, Issue 3, pp. 37-48, 2012.

PEER-REVIEWED CONFERENCES

[C15] S. Li, D. Niu, K. Malladi, H. Zheng. “DRISA: A DRAM-based Reconfigurable In-Situ Accelerator,” in 50th IEEE/ACM International Symposium on Microarchitecture (MICRO), Boston, October 2017.

[C14] Q. Xu, K. Malladi, M. Awasthi. “Rack Level Scheduling for Containerized Workloads,” in 27th IEEE International Conference on Network, Architecture and Storage (NAS), Shenzhen, China, August 2017.

[C13] K. Malladi, M. Chang, D. Niu, H. Zheng. “FlashStorageSim: Performance Modeling for SSD Architectures,” in 27th IEEE International Conference on Network, Architecture and Storage (NAS), Shenzhen, China, August 2017.

[C12] Q. Xu, M. Awasthi, K. Malladi, J. Bhimani, J. Yang, M. Annavaram. “Performance Analysis of Containerized Applications on Local and Remote Storage,” in 34th International Conference on Massive Storage Systems and Technology (MSST), Santa Clara, May 2017.

[C11] Q. Xu, M. Awasthi, K. Malladi, J. Bhimani, J. Yang, M. Annavaram. “Docker Characterization on High Performance SSDs,” in 18th IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Santa Rosa, April 2017.

[C10] M. Awasthi, K. Malladi. “KOVA: A Tool for Kernel Visualization and Analysis,” in 35th IEEE International Performance Computing and Communications Conference (IPCCC), Las Vegas, December 2016.

[C9] K. Malladi, M. Awasthi, H. Zheng. “FlexDrive: A Framework to Explore NVMe Storage Solutions,” in 18th IEEE International Conference on High Performance Computing and Communications (HPCC), Sydney, Australia, December 2016.

[C8] K. Malladi, M. Awasthi, H. Zheng. “DRAMPersist: Making DRAM Systems Persistent,” in 2nd ACM International Symposium on Memory Systems (MEMSYS), Washington D.C., October 2016.

[C7] K. Malladi, U. Kong, M. Awasthi, H. Zheng. “DRAMScale: Mechanisms to Increase DRAM Capacity,” in 2nd ACM International Symposium on Memory Systems (MEMSYS), Washington D.C., October 2016.

[C6] K. Malladi, M. Awasthi, H. Zheng. “Software-Defined Emulation Infrastructure for High Speed Storage,” in 9th ACM International on Systems and Storage Conference (SYSTOR), Haifa, Israel, June 2016.

[C5] M. Gao, C. Delimitrou, D. Niu, K. Malladi, H. Zheng, B. Brennan, C. Kozyrakis. “DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric,” in 43rd ACM/IEEE International Symposium on Computer Architecture (ISCA), Seoul, Korea, June 2016.

[C4] K. Malladi, M. Chang, J. Ping, H. Zheng. “FAME: A Fast and Accurate Memory Emulator for New Memory System Architecture Exploration,” in 23rd IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Atlanta, October 2015.

[C3] K. Malladi, I. Shaeffer, L. Gopalakrishnan, D. Lo, B.C. Lee, M. Horowitz. “Rethinking DRAM Powermodes for Energy Proportionality,” in 45th IEEE/ACM International Symposium on Microarchitecture (MICRO), Vancouver, December 2012.

[C2] K. Malladi, F. Nothaft, K. Periyathambi, B. Lee, C. Kozyrakis, M. Horowitz. “Towards Energy-proportional Datacenter Memory with Mobile DRAM,” in 39th ACM/IEEE International Symposium on Computer Architecture (ISCA), Portland, June 2012.

[C1] K. Malladi, D. V. Anderson. “Analog Implementation of SNR Based Gain Adaptation for Denoising,” in 42nd IEEE International Symposium on Circuits and Systems (ISCAS), May 2009.

CONFERENCE PREPRINTS AND POSTERS

[P2] Q. Xu, K. Malladi, M. Awasthi. “Rack Level Scheduling for Containerized Workloads,” in 8th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), Mumbai, India, September 2017.

[P1] A. Boroumand, S. Ghose, B. Lucia, K. Hsieh, K. Malladi, H. Zheng, O. Mutlu. “LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory,” in arXiv preprint arXiv:1706.03162, June 2017.

Granted Patents

[I10] US 9934154 B2 “Electronic system with memory management mechanism and method of operation thereof”

[I9] US 9954533 B2 “DRAM-Based reconfigurable logic”

[I8] US 9922696 B1 “Circuits and micro-architecture for a DRAM-Based processing unit”

[I7] US 15/905348 B2 “Electronic system with memory management mechanism and method of operation thereof”

[I6] US 9761296 B2 “Smart in-module refresh for DRAM”

[I5] US 9966152 B2 “Dedupe DRAM system algorithm architecture”

[I4] US 9577644 B2, CN 105703765 “A Reconfigurable logic architecture”

[I3] US 9524769 B2 “Smart in-module refresh for DRAM”

[I2] US 9727239 B2 “Electronic system with partitioning mechanism and method of operation thereof”

[I1] US 9503095 B2, CN 105703765 “A Space-multiplexing DRAM-based reconfigurable logic”

PEER REVIEWING FOR JOURNALS, CONFERENCES

Program Committee Member: ACM/IEEE NAS-2018, CF-2018, DAC-2018, NAS-2017, DAC-2017, CF-2017, ISVLSI-2017, ISPASS-2015, ISCA Memory-Forum-2014.

External Program Committee Member: ACM/IEEE ICS-2017, ISLPED-2016, HPCA-2014, MICRO-2014, MICRO-2013, ISCA-2011.

Journal Editorial Committee: IEEE Transactions on NanoTechnology-2018, Computer Architecture Letters-2015, Transactions on Computers-2013.

NSF Proposal, 2012, “Energy Efficient Memory Systems for Multi-core Chips”

NSF Proposal, 2010, “Heterogeneous Memory Systems for Energy Efficiency”

Awards

  • IEEE MICRO-TOP PICKS-2017 for novelty and long-term impact in computer architecture
  • Samsung Memory Global R&D Excellence award, 2017
  • Samsung Memory President’s award for Best Project, 2014
  • Google Best-focused research appraisal, 2010
  • Benchmark Capital Stanford Fellowship Award, 2009
  • Academic Proficiency Medal, IIT Kanpur, 2009
  • Qualcomm Roberto Padovani award for best R&D Innovation, 2008

Press

Robert McMillan. “Data centers eye second raid on your cellphone”, Wired, March 2013

Chris Edwards. “The wrong units of compute”, Tech Design Forums, April 2013

“Re-building the server - microprocessors to memory moves”, Journal of the Institute of Engineering and Technology, September 2013

Talks and Teaching

AMD Inc, 2017
Wave Computing, 2017
IEEE NAS-27, Aug’17
IEEE HPCC-18, Dec’16
ACM MEMSYS-2, Washington D.C., Oct’16
ACM SYSTOR-9, Jun’16
IEEE MASCOTS-23, Atlanta, Oct’15
Keynote speech on “Building next generation server systems” at EDCS, HPCA-2014, Orlando, Feb’14.
Intel Labs R&D, Santa Clara, Apr’13
Samsung R&D, San Jose, May’13
Qualcomm Processor R&D, Raleigh, Jun’13
Qualcomm SoC R&D, San Diego, Jun’13
Center for Integrated Systems, Stanford, Nov’12
IEEE/ACM MICRO-45, Vancouver, Canada, Dec’12
Pervasive Parallelism Lab Retreat, Santa Cruz, Dec’12.
ACM/IEEE ISCA-39, Portland, Jun’12.
Pervasive Parallelism Lab Retreat, San Francisco, Jun’12.
Rambus Labs, Sunnyvale, Oct’11.
Computer Forum, Stanford University, Jun’11.
Google Inc, Mountain View, Nov & Mar’10.
IEEE ISCAS-42, May’09.
Qualcomm, San Diego, Jul’08.
Indo-German Winter Academy (VLSI), IIT Guwahati.
Teaching Assistant for Microelectronic Circuits for 75 students, IIT Kanpur.
Teaching Assistant for Analog Circuits Laboratory for 25 students, IIT Kanpur.