News information

Rockchip RK3308 Smart Speaker Solution Using PSRAM as Replacement for DDR3 RAM

The DDR3 market continues to face shortages and high prices. The Rockchip RK3308 (quad-core ARM Cortex-A35, integrated hardware VAD, support for 8-microphone arrays) is a mainstream processor for smart speakers and voice control panels, and it requires external memory to run Linux and complex audio applications. Traditional 128–512MB DDR3 designs are under pressure from both supply constraints and cost. PSRAM (Pseudo SRAM) offers a simple interface, stable supply, and lower cost, with capacities from 64Mbit to 512Mbit (8MB to 64MB). That range suits audio decoding and many smart speaker application scenarios, making PSRAM a practical alternative amid the DDR3 shortage.

PSRAM vs DDR3: Adaptability Comparison for Smart Speaker Applications

1. Core Feature Comparison (RK3308 Application Scenario)

Comparison Item	DDR3 (Traditional Solution)	PSRAM (Alternative Solution)	Conclusion for Smart Speakers
Interface	Parallel DDR interface requiring PHY and complex clock/routing	QSPI/OPI serial interface: 6 signals for QSPI (CLK, CS, IO0–IO3), 8 data lines plus clock and chip select for OPI; no DDR PHY needed	PSRAM enables a much simpler hardware design, lower PCB cost, and easier routing
Bandwidth	Up to 12.8GB/s (DDR3-1600 on a 64-bit bus; narrower buses scale down proportionally)	QSPI: ~532Mbps (~66MB/s); OPI: ~1.6Gbps (~200MB/s)	Sufficient for audio decoding, microphone array buffering, and AEC/NS algorithms, including lossless audio and multi-mic scenarios
Capacity	64MB–512MB (mainstream)	8MB–64MB (64Mbit–512Mbit)	Covers entry-level to mid-range memory requirements for smart speakers; high-end configurations can use multiple chips in parallel
Refresh	Requires periodic refresh by the host, high software complexity	Built-in self-refresh, transparent to the system, no software intervention	Higher development efficiency with PSRAM, enabling fast Linux porting
Supply & Cost	Shortages, large price fluctuations, long lead times	Stable supply, lower cost	PSRAM resolves supply shortages and significantly reduces BOM cost
Power Consumption	High dynamic power consumption, average standby power	Low dynamic power consumption, ultra-low standby power	Ideal for battery-powered and low-power speaker designs
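
To put the bandwidth row in perspective, the raw PCM rate of a microphone array can be compared against the interface figures in the table. The following is a minimal Python sketch. The capture format (16 kHz, 16-bit) is an illustrative assumption rather than a value from RK3308 documentation, and the function names are hypothetical.

```python
# Interface figures taken from the comparison table above (bits per second).
QSPI_BPS = 532_000_000
OPI_BPS = 1_600_000_000

def pcm_bitrate_bps(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Raw (uncompressed) PCM data rate in bits per second."""
    return channels * sample_rate_hz * bits_per_sample

def link_utilization(stream_bps: int, link_bps: int) -> float:
    """Fraction of peak link bandwidth consumed by one stream."""
    return stream_bps / link_bps

# Assumed capture format: 8 microphones, 16 kHz, 16-bit samples.
mic_array_bps = pcm_bitrate_bps(8, 16_000, 16)
print(mic_array_bps)  # 2048000
print(f"{link_utilization(mic_array_bps, QSPI_BPS):.2%} of QSPI peak")
```

These are peak figures: command overhead and the code and data traffic of Linux itself share the same link, so the usable headroom is smaller than the raw ratio suggests.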

Adaptability to Key Smart Speaker Scenarios

Far-field voice capture (4/6/8-mic arrays): PSRAM can buffer multi-channel audio sampling data and meet the historical frame buffering requirements of AEC/NS and dual-wakeword algorithms.
Local + cloud dual wakeword / VAD: Hardware VAD with PSRAM buffering eliminates frequent Flash access, providing the same wake response speed as DDR3 solutions.
Multi-format audio decoding (MP3/FLAC/AAC/APE): PSRAM bandwidth supports high-spec lossless decoding, with decoding buffers resident in PSRAM for smooth playback.
Linux + voice SDK (Baidu / Alibaba / iFlyTek): PSRAM acts as system cache and algorithm memory pool. After Linux kernel optimization, the voice system runs stably.
Bluetooth / Wi-Fi / DLNA / AirPlay: Network protocol stacks and audio streams can be buffered in PSRAM, ensuring smooth multitasking and supporting wireless audio casting.
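
The "historical frame buffering" mentioned for AEC/NS amounts to keeping the most recent N frames of multi-channel audio available. The sketch below models that behavior in Python with a hypothetical `FrameHistory` class. On the target, the equivalent would more likely be a fixed-size ring buffer written in C and placed in the PSRAM region.

```python
from collections import deque

class FrameHistory:
    """Keeps the most recent `max_frames` audio frames (hypothetical helper)."""

    def __init__(self, max_frames: int) -> None:
        # A deque with maxlen drops the oldest entry automatically when full.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def snapshot(self) -> list:
        """Return buffered frames, oldest first."""
        return list(self._frames)

history = FrameHistory(max_frames=3)
for i in range(5):
    # Each frame stands in for one block of interleaved multi-mic samples.
    history.push(bytes([i]))

print(history.snapshot())  # [b'\x02', b'\x03', b'\x04']
```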

RK3308 + PSRAM Replacement Solution: Hardware & Software Implementation

1. Hardware Design: RK3308 with PSRAM (QSPI/OPI)

(1) Host Selection

The solution is based on the standard RK3308 or the industrial-grade RK3308B; the RK3308G/H are not applicable. The RK3308 integrates a QSPI/OPI controller with direct PSRAM support, requiring no additional adapter chips.

(2) Key Hardware Connections

QSPI mode: Connect to RK3308 QSPI_CLK, CS, IO0–IO3 (6 pins total), no extra control signals required.
OPI mode: Connect IO0–IO7 (8 data lines) alongside the clock and chip select, doubling the data width for high-bitrate audio scenarios.
Power supply: 1.8V, compatible with RK3308 I/O voltage, no level shifter needed.
PCB design: Simple serial routing allows 2-layer boards, greatly reducing cost compared to DDR3’s 4–6-layer boards.
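
The interface figures quoted in the comparison table follow from lane count and transfer rate. As a rough sketch, the clock values below (133 MHz single data rate for QSPI, 100 MHz double data rate for OPI) are assumptions chosen to reproduce the ~532Mbps and ~1.6Gbps figures. Actual limits depend on the specific PSRAM part and controller.

```python
def peak_bits_per_second(clock_hz: int, data_lanes: int, double_data_rate: bool) -> int:
    """Theoretical peak throughput of a serial memory interface."""
    transfers_per_clock = 2 if double_data_rate else 1
    return clock_hz * data_lanes * transfers_per_clock

# Assumed operating points, not datasheet values.
qspi = peak_bits_per_second(133_000_000, 4, double_data_rate=False)
opi = peak_bits_per_second(100_000_000, 8, double_data_rate=True)

print(qspi)  # 532000000
print(opi)   # 1600000000
```

Doubling the lane count alone doubles throughput at a given clock. Under these assumptions, the remaining gap between the two figures comes from the transfer rate.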

(3) Capacity Expansion: Multi-chip Parallel Operation

For capacities of 128MB and above, multiple PSRAM chips can be used in parallel (distinguished by CS chip select). The RK3308 supports QSPI/OPI multi-chip expansion for high-end speaker applications.
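
One way to picture CS-based expansion is as a linear address space split across identical chips. The sketch below is purely illustrative: the function name, the contiguous (non-interleaved) layout, and the 64MB chip size are assumptions, and the actual mapping is defined by the controller.

```python
CHIP_SIZE = 64 * 1024 * 1024  # assumed: one 512Mbit (64MB) chip

def decode_address(addr: int, chip_count: int, chip_size: int = CHIP_SIZE) -> tuple:
    """Map a linear address to (chip-select index, offset within chip)."""
    if not 0 <= addr < chip_count * chip_size:
        raise ValueError("address outside the populated range")
    return divmod(addr, chip_size)

print(decode_address(0x0000_1000, chip_count=2))     # (0, 4096)
print(decode_address(CHIP_SIZE + 16, chip_count=2))  # (1, 16)
```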

2. Software Adaptation: Linux System & PSRAM Driver Porting

(1) Kernel Configuration

Enable the RK3308 QSPI/OPI controller driver and configure PSRAM in memory-mapped mode (MMU-mapped as system memory). Trim the Linux kernel by disabling unnecessary services and graphics modules, which reduces kernel size and leaves more PSRAM for audio/voice algorithms.

(2) Memory Allocation Strategy

PSRAM dedicated area: Allocate the majority of PSRAM for audio buffers, mic array buffering, and AEC/NS algorithm memory pools.
System runtime area: Reserve remaining space for the Linux kernel, processes, and network protocol stacks.
Disable swap partition: Due to limited PSRAM bandwidth, swap is disabled to avoid performance degradation.
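
The split described above can be expressed as a simple partition plan. This is a minimal sketch, assuming a hypothetical 75% share for the audio region and 4KiB page alignment. Neither value comes from the source, and real figures depend on the kernel footprint after trimming.

```python
PAGE = 4096  # assumed page size used for alignment

def plan_partitions(total_bytes: int, audio_fraction: float) -> dict:
    """Split PSRAM into an audio region and a system region, page-aligned."""
    if not 0.0 < audio_fraction < 1.0:
        raise ValueError("audio_fraction must be between 0 and 1")
    # Round the audio region down to a whole number of pages.
    audio_size = int(total_bytes * audio_fraction) // PAGE * PAGE
    return {
        "audio": {"offset": 0, "size": audio_size},
        "system": {"offset": audio_size, "size": total_bytes - audio_size},
    }

plan = plan_partitions(64 * 1024 * 1024, audio_fraction=0.75)
print(plan["audio"]["size"] // (1024 * 1024), "MiB audio")    # 48 MiB audio
print(plan["system"]["size"] // (1024 * 1024), "MiB system")  # 16 MiB system
```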

(3) Voice SDK Adaptation

Adjust the memory allocation interfaces of mainstream voice SDKs so that algorithm variables and model buffers are placed in the PSRAM area. Optimize VAD and wakeword detection: hardware VAD results are written directly to PSRAM, and the CPU handles only post-processing, which reduces its load.
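
Redirecting SDK allocations to a dedicated region is commonly done with a pool allocator. The following models a bump (arena) allocator in Python for illustration only. `Arena` and its methods are hypothetical and do not correspond to any vendor SDK API. On the target this would be written in C over the reserved PSRAM range.

```python
class Arena:
    """Bump allocator over a fixed address range (illustrative model)."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base
        self.end = base + size
        self._next = base

    def alloc(self, nbytes: int, align: int = 64) -> int:
        """Return the start address of an aligned block, or raise MemoryError."""
        # Round the cursor up to the next multiple of `align`.
        start = (self._next + align - 1) // align * align
        if start + nbytes > self.end:
            raise MemoryError("arena exhausted")
        self._next = start + nbytes
        return start

    def reset(self) -> None:
        """Release everything at once, e.g. when a voice session ends."""
        self._next = self.base

arena = Arena(base=0x1000, size=4096)
model_buf = arena.alloc(1000)
scratch = arena.alloc(100)
print(hex(model_buf), hex(scratch))  # 0x1000 0x1400
```

A bump allocator has no per-block free, which keeps it simple and fragmentation-free for buffers that live as long as the algorithm instance does.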

(4) Performance Optimization

Enable DMA data transfer: I2S audio streams and mic array data are written directly to PSRAM via DMA, so the CPU only sets up transfers and handles completion events, improving real-time performance.
Optimize audio decoding libraries: Decoding buffers reside in PSRAM to reduce Flash access and ensure controllable decoding latency.
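
With DMA, the period size determines both how often the CPU is notified and how much latency the buffer adds. A short sketch of that arithmetic follows, assuming illustrative values (256-frame periods at 16 kHz, two periods for ping-pong buffering) that are not taken from the source.

```python
def period_ms(frames_per_period: int, sample_rate_hz: int) -> float:
    """Duration of one DMA period in milliseconds."""
    return frames_per_period * 1000 / sample_rate_hz

def dma_buffer_bytes(frames_per_period: int, periods: int,
                     channels: int, bytes_per_sample: int) -> int:
    """Total bytes reserved for a multi-period DMA buffer."""
    return frames_per_period * periods * channels * bytes_per_sample

print(period_ms(256, 16_000))  # 16.0
print(dma_buffer_bytes(256, periods=2, channels=8, bytes_per_sample=2))  # 8192
```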

PSRAM Selection Guidelines for Speakers with Different Configurations

1. Entry-level Smart Speaker (2 mics, local wakeword, basic audio decoding)

Host: Standard RK3308
PSRAM capacity: Low-capacity PSRAM
System: Optimized Linux + lightweight voice SDK
Advantage: Lowest cost, most stable supply, meets basic entry-level needs

2. Mid-range Smart Speaker (4 mics, AEC/NS, local + cloud dual wakeword)

Host: Industrial RK3308B (with CAN, suitable for gateway scenarios)
PSRAM capacity: Mid-capacity PSRAM, OPI interface recommended
System: Full Linux + mainstream voice SDK
Advantage: Sufficient bandwidth, stable operation, ideal for mass-production mainstream configurations

3. High-end Smart Speaker (8 mics, lossless decoding, multi-protocol concurrency)

Host: Industrial RK3308B
PSRAM capacity: High-capacity PSRAM or multiple chips in parallel
System: Full Linux + full-featured voice SDK + high-definition audio decoding
Advantage: Performance close to DDR3 solutions, with clear supply and cost benefits

Core Advantages of PSRAM Over DDR3

Supply chain relief: Solves DDR3 shortages and long lead times; PSRAM supply is stable to ensure continuous mass production.
Significant cost reduction: Lower BOM and PCB costs, ideal for high-volume smart speaker products.
Good scenario fit: Bandwidth and capacity cover the audio decoding and application needs of RK3308 smart speakers and voice control panels.
Low migration cost: Native QSPI/OPI PSRAM support on RK3308, mature Linux drivers, and minimal software adaptation effort.

Amid the ongoing DDR3 shortage, the RK3308/RK3308B + PSRAM combination has become a mature alternative for smart speakers and voice control products. It offers stability, cost efficiency, and mass-producibility, making it a strong option at this stage.


contact us

Telephone: +86-150-1290-5940

Mobile phone: +86-150-1290-5940

Mailbox: sales@manduic.com

Address: Room 618, 6th Floor, Derun Building, No. 366 Chaofeng Road, Fenghuang Street, Guangming District, Shenzhen, China
