Mominul's Blog

250+ PostgreSQL Interview Q&A 2026 | PostgreSQL Developer • DBA • Cloud • AI | FreeLearning365

🐣 PostgreSQL Developer — Beginner 0-2 Yrs

▼

Q1What is PostgreSQL and what makes it unique among SQL databases?

PostgreSQL is an advanced, open-source object-relational database system. It emphasizes extensibility, standards compliance (ACID, SQL), and supports modern features like JSONB, full-text search, and custom types. It's often called the "world's most advanced open-source database."

Q2Explain the PostgreSQL architecture: shared memory, background processes, and storage.

PostgreSQL uses a multi-process model (postmaster spawns backends). Key processes: writer, WAL writer, autovacuum, stats collector. Shared memory includes buffer cache and WAL buffers. Data is stored in pages (8KB) with MVCC via tuple versions.

Q3How do you create a database and user in PostgreSQL? Provide a secure example.

CREATE DATABASE ecommerce OWNER app_admin;
CREATE USER app_user WITH PASSWORD 'Str0ng!Pass';
GRANT CONNECT ON DATABASE ecommerce TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;

Q4What is MVCC and why is it crucial for high concurrency?

Multi-Version Concurrency Control keeps multiple versions of a row. Readers don't block writers and vice versa. Each transaction sees a snapshot of the database, providing high concurrency without heavy locking. VACUUM eventually cleans up old versions.

Q5Explain the basic data types: INTEGER, NUMERIC, VARCHAR, TEXT, BOOLEAN, DATE, TIMESTAMP, JSONB.

PostgreSQL offers rich types. NUMERIC(precision, scale) for exact money, TEXT for large strings, VARCHAR(n) for limited text. JSONB stores binary JSON with indexing. DATE and TIMESTAMP (with/without time zone) are crucial for global apps.

Q6How do you write a simple PL/pgSQL function?

CREATE FUNCTION get_total_orders(cust_id INT) RETURNS INT AS $$
BEGIN RETURN (SELECT COUNT(*) FROM orders WHERE customer_id = cust_id); END;
$$ LANGUAGE plpgsql;

Q7What is a primary key? Why use a sequence or SERIAL?

A primary key uniquely identifies a row. SERIAL creates an auto-incrementing integer. Use IDENTITY columns (GENERATED AS IDENTITY) in modern PostgreSQL.

Q8How do you perform a JOIN? INNER, LEFT, RIGHT, FULL OUTER.

SELECT c.name, o.amount FROM customers c INNER JOIN orders o ON c.id = o.cust_id;

Q9What is an index? B-tree vs GIN vs GiST.

B-tree for equality/range. GIN for full-text/arrays/JSONB. GiST for geometric data. Appropriate indexes can turn a full scan into an index lookup.

Q10Explain the difference between DELETE, TRUNCATE, and DROP in PostgreSQL.

DELETE removes rows with WHERE, can be rolled back. TRUNCATE removes all rows quickly but is transactional? Yes, in PostgreSQL TRUNCATE is MVCC-safe and can be rolled back. DROP removes the table.

Q11How do you use GROUP BY and HAVING for business reports?

SELECT department, AVG(salary) FROM employees GROUP BY department HAVING AVG(salary) > 50000;

Q12What is a VIEW and how does it simplify complex queries?

A view is a saved query. Use for security (hide columns), abstraction, and reusable logic. Materialized views cache results.

Q13Explain the EXPLAIN command and its output.

EXPLAIN SELECT * FROM orders WHERE amount > 1000;

Q14How do you handle NULL values with COALESCE and NULLIF?

SELECT COALESCE(phone, 'N/A') FROM contacts;
SELECT NULLIF(division, 0) FROM stats;

Q15What is a transaction? ACID in PostgreSQL.

BEGIN; UPDATE ...; SAVEPOINT sp1; ... ROLLBACK TO sp1; COMMIT;

Q16How do you import/export data using COPY and pg_dump?

\COPY products FROM '/data/products.csv' CSV HEADER;
pg_dump mydb > backup.sql

Q17What are schemas and search_path?

Schemas organize objects. search_path determines which schema is checked first for unqualified names. Typical pattern: `public` for app, separate schemas per module.

Q18Explain the difference between VARCHAR and TEXT.

VARCHAR(n) checks length; TEXT is unlimited. Performance is identical; prefer TEXT for flexibility unless you need a constraint.

Q19How do you use the SERIAL type and IDENTITY columns?

CREATE TABLE users (id SERIAL PRIMARY KEY); -- old
CREATE TABLE users (id INT GENERATED ALWAYS AS IDENTITY); -- modern

Q20What is the purpose of VACUUM? Why is it important?

VACUUM reclaims storage occupied by dead tuples and updates statistics for the planner. Without it, table bloat and performance degrade. Autovacuum automates this.

Q21How do you use string functions like SPLIT_PART, REGEXP_REPLACE?

SELECT SPLIT_PART(email, '@', 2) AS domain FROM users;

Q22Explain date/time handling: INTERVAL, AT TIME ZONE.

SELECT NOW() - INTERVAL '7 days';
SELECT order_time AT TIME ZONE 'UTC' AT TIME ZONE 'US/Eastern';

Q23What is a SUBQUERY? Correlated subquery example.

SELECT * FROM employees e WHERE salary > (SELECT AVG(salary) FROM employees WHERE dept = e.dept);

Q24How do you use LIMIT and OFFSET for pagination?

SELECT * FROM products ORDER BY id LIMIT 20 OFFSET 40;

Q25What are domains and check constraints?

CREATE DOMAIN positive_int AS INT CHECK (VALUE > 0);
ALTER TABLE products ADD CONSTRAINT price_positive CHECK (price > 0);

Q26How do you find duplicate rows?

SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1;

Q27What is a CTE (WITH clause)? Basic example.

WITH recent_orders AS (SELECT * FROM orders WHERE order_date > CURRENT_DATE - 30)
SELECT customer_id, SUM(amount) FROM recent_orders GROUP BY customer_id;

Q28How do you rename a table or column?

ALTER TABLE old_name RENAME TO new_name;
ALTER TABLE orders RENAME COLUMN amt TO amount;

Q29Explain the difference between UNION and UNION ALL.

UNION removes duplicates; UNION ALL keeps all. Use UNION ALL unless you need distinct results; it's faster.

Q30How do you comment on database objects for documentation?

COMMENT ON TABLE employees IS 'Stores employee data';
COMMENT ON COLUMN employees.salary IS 'Annual salary in USD';

🐥 PostgreSQL Developer — Intermediate 2-5 Yrs

▼

Q31Explain PostgreSQL's MVCC in depth: xmin, xmax, and transaction snapshots.

Each row has hidden columns xmin (creating transaction) and xmax (deleting/updating transaction). A transaction sees rows where xmin is committed before its snapshot and xmax is not committed or later. This provides isolation without read locks.

Q32How do you write a PL/pgSQL function with loops and conditionals for a business process?

CREATE FUNCTION apply_bonus(dept_id INT, pct DECIMAL) RETURNS void AS $$
DECLARE emp RECORD;
BEGIN
    FOR emp IN SELECT id, salary FROM employees WHERE department_id = dept_id LOOP
        UPDATE employees SET salary = salary * (1 + pct/100) WHERE id = emp.id;
    END LOOP;
END; $$ LANGUAGE plpgsql;

Q33What are window functions? Use RANK() and SUM() OVER for a sales dashboard.

SELECT sales_rep, revenue,
   RANK() OVER (ORDER BY revenue DESC) as rank,
   SUM(revenue) OVER () as total_revenue
FROM quarterly_sales;

Q34How do you use JSONB functions and operators? Indexing JSONB.

SELECT data->>'name' AS name FROM users WHERE data @> '{"vip": true}';
CREATE INDEX idx_gin ON users USING GIN (data jsonb_path_ops);

Q35Explain recursive CTEs for hierarchical data (org chart).

WITH RECURSIVE org AS (
   SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
   UNION ALL
   SELECT e.id, e.name, e.manager_id FROM employees e JOIN org o ON e.manager_id = o.id
) SELECT * FROM org;

Q36What are triggers and trigger functions? Create an audit trail.

CREATE FUNCTION audit_func() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log VALUES (OLD.*, NOW());
    RETURN NEW;
END; $$ LANGUAGE plpgsql;
CREATE TRIGGER audit_emp AFTER UPDATE ON employees FOR EACH ROW EXECUTE FUNCTION audit_func();

Q37How do you use LISTAGG equivalent (STRING_AGG) in PostgreSQL?

SELECT department, STRING_AGG(name, ', ' ORDER BY name) FROM employees GROUP BY department;

Q38Explain partial indexes and covering indexes with INCLUDE.

CREATE INDEX idx_active ON orders (order_date) WHERE status = 'active';
CREATE INDEX idx_cover ON orders (customer_id) INCLUDE (amount, order_date);

Q39What is the difference between INNER JOIN and WHERE EXISTS?

EXISTS stops at first match, often faster for existence checks. JOIN returns duplicate rows if many matches. Use EXISTS for semi-joins, JOIN when you need columns from both sides.

Q40How do you use table partitioning? Declarative partitioning example.

CREATE TABLE orders (
   order_id INT, amount NUMERIC, order_date DATE
) PARTITION BY RANGE (order_date);
CREATE TABLE orders_2026q1 PARTITION OF orders FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');

Q41Explain foreign data wrappers (FDW) for querying external PostgreSQL or other databases.

CREATE EXTENSION postgres_fdw;
CREATE SERVER remote_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '10.0.0.5', dbname 'remote');
CREATE USER MAPPING FOR local_user SERVER remote_server OPTIONS (user 'remote_user', password 'secret');
CREATE FOREIGN TABLE remote_orders (...) SERVER remote_server OPTIONS (schema_name 'public', table_name 'orders');

Q42How do you implement pagination with keyset (cursor-based) pagination?

SELECT * FROM orders WHERE id > :last_id ORDER BY id LIMIT 20;

Q43What is a materialized view and when to use it?

CREATE MATERIALIZED VIEW monthly_sales AS SELECT ...;
REFRESH MATERIALIZED VIEW monthly_sales CONCURRENTLY;

Q44Explain array types and their operators in PostgreSQL.

SELECT tags[1] FROM articles WHERE 'postgres' = ANY(tags);

Q45How do you use the RETURNING clause in DML?

INSERT INTO users (name) VALUES ('John') RETURNING id;
UPDATE products SET price = price*1.1 WHERE id=1 RETURNING price;

Q46What are advisory locks and when are they useful?

SELECT pg_advisory_lock(123); -- application-level lock for mutual exclusion

Q47Explain the difference between SERIALIZABLE and READ COMMITTED isolation.

READ COMMITTED sees changes from concurrent committed transactions. SERIALIZABLE uses predicate locking to ensure true serial execution. Most apps use READ COMMITTED; SERIALIZABLE for financial consistency.

Q48How do you use window functions with PARTITION BY and ORDER BY for running totals?

SELECT date, amount, SUM(amount) OVER (PARTITION BY account_id ORDER BY date) FROM transactions;

Q49What are generated columns? Example with an expression.

ALTER TABLE employees ADD full_name TEXT GENERATED ALWAYS AS (first_name || ' ' || last_name) STORED;

Q50How do you use the COPY command programmatically with PL/pgSQL?

COPY (SELECT * FROM orders WHERE order_date = CURRENT_DATE) TO '/tmp/orders.csv' CSV HEADER;

Q51Explain the HOT update optimization in PostgreSQL.

Heap-Only Tuple updates avoid index bloat when the update does not change indexed columns. The new tuple stays in the same page and is linked from the old one. Reduces VACUUM work.

Q52What is a BRIN index and when is it effective?

Block Range INdex stores min/max per block range. Very small, ideal for large tables with natural ordering (e.g., time-series). Faster than B-tree for range scans on huge tables if correlation is high.

Q53How do you implement a queue using SKIP LOCKED?

WITH next AS (
   SELECT id FROM job_queue WHERE status='pending' ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED
) UPDATE job_queue SET status='processing' FROM next WHERE job_queue.id = next.id RETURNING *;

Q54Explain the difference between plpgsql and sql language functions.

SQL functions can be inlined by the optimizer, often faster for simple queries. PL/pgSQL adds procedural logic (loops, conditionals) but may have overhead. Choose SQL when possible.

Q55What are event triggers in PostgreSQL?

They fire on DDL commands (CREATE, ALTER, DROP). Useful for auditing schema changes or enforcing naming conventions. Example: `CREATE EVENT TRIGGER ... ON ddl_command_start EXECUTE FUNCTION ...`.

Q56How do you use the pg_stat_statements extension for query analysis?

CREATE EXTENSION pg_stat_statements;
SELECT query, calls, mean_time FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;

Q57Explain the difference between NOW() and transaction_timestamp().

NOW() returns the transaction start time (stable within transaction). clock_timestamp() returns current wall-clock time. Use NOW() for consistent auditing.

Q58What are the different types of table inheritance? When is it useful?

PostgreSQL supports table inheritance (child tables inherit columns). Mostly replaced by declarative partitioning, but still useful for certain partitioning patterns with custom rules.

Q59How do you use UPSERT (INSERT ... ON CONFLICT)?

INSERT INTO users (id, email) VALUES (1, 'a@b.com')
ON CONFLICT (id) DO UPDATE SET email = EXCLUDED.email;

Q60Explain the tsearch2 / full-text search capabilities with GIN indexes.

SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('postgres & interview');

🦅 PostgreSQL Developer — Expert 5-10 Yrs

▼

Q61Explain the PostgreSQL WAL (Write-Ahead Log) internals and how it supports replication and crash recovery.

All modifications are logged to WAL before data files. During recovery, WAL is replayed. For replication, WAL records are streamed to standbys. WAL level must be 'replica' or 'logical'. Key parameter: wal_level, max_wal_senders.

Q62How does logical replication work in PostgreSQL 10+? Set up a publication and subscription.

-- Publisher
CREATE PUBLICATION pub1 FOR TABLE users;
-- Subscriber
CREATE SUBSCRIPTION sub1 CONNECTION 'host=pub dbname=mydb' PUBLICATION pub1;

Q63What is a parallel query and how to tune it? When does it help?

PostgreSQL can use parallel workers for sequential scans, joins, aggregates. Set max_parallel_workers_per_gather. Useful for large datasets. Monitor with EXPLAIN (ANALYZE, VERBOSE).

Q64Explain the JIT compilation in PostgreSQL 11+ and when it benefits.

JIT compiles expressions to machine code. Helps CPU-bound queries with complex WHERE. Enable with jit=on. Not beneficial for simple queries; test before enabling globally.

Q65How do you diagnose and fix a "remaining connection slots are reserved" error?

Check max_connections and superuser_reserved_connections. Increase max_connections or use connection pooling (PgBouncer). Find idle connections with `SELECT * FROM pg_stat_activity WHERE state = 'idle';`

Q66What are the key performance tuning parameters in postgresql.conf?

shared_buffers (25% RAM), effective_cache_size (50-75% RAM), work_mem (per operation), maintenance_work_mem, wal_buffers, random_page_cost (1.1 for SSD), effective_io_concurrency.

Q67Explain the use of pg_prewarm and the buffer cache.

SELECT pg_prewarm('big_table'); -- loads table into OS cache or shared buffers

Q68How do you handle large-scale data loading efficiently?

Use COPY (not INSERT), drop indexes before load then recreate, increase maintenance_work_mem, disable triggers, use UNLOGGED tables if acceptable. Parallelize with pg_bulkload.

Q69What is the pg_cron extension and how does it replace cron for DB tasks?

SELECT cron.schedule('nightly-vacuum', '0 3 * * *', 'VACUUM ANALYZE');

Q70How do you design a multi-tenant architecture in PostgreSQL? Row-level security vs schema per tenant.

RLS is built-in: `CREATE POLICY ... USING (tenant_id = current_setting('app.tenant_id')::int)`. Schema-per-tenant provides isolation but management overhead. Choose based on tenant count and security needs.

Q71Explain the "count(*) slow" problem in PostgreSQL and how to mitigate it.

PostgreSQL does not cache row counts due to MVCC, always sequential scan. Use estimates via `pg_class.reltuples`, or use a trigger-maintained counter table. In 14+, `VACUUM` updates visibility map to allow index-only scans for count if no dead tuples.

Q72How do you use pgBouncer for connection pooling? Modes and best practices.

pgBouncer acts as a lightweight pooler. Modes: session, transaction, statement. Transaction pooling works best for web apps. Configure pool size, max_client_conn, and use `server_reset_query = DISCARD ALL`.

Q73What is the purpose of the visibility map and how does it improve VACUUM?

It tracks which pages contain only tuples visible to all transactions. VACUUM can skip those pages, and index-only scans can use it to avoid fetching heap tuples. Reduces I/O.

Q74Explain the difference between logical decoding and streaming replication.

Streaming replication ships WAL as is (physical). Logical decoding transforms WAL into a higher-level representation of changes, allowing selective replication, multi-version compatibility, and integration with other systems (Debezium).

Q75How do you use the pg_repack extension for online table repacking?

pg_repack -t bloated_table -d mydb

Q76What are the benefits of using CHECK constraints over triggers for data validation?

CHECK constraints are declarative, faster, and enforced at the engine level. They are evaluated before row insertion, making them more efficient than trigger-based validation.

Q77How do you implement a real-time analytics pipeline with PostgreSQL and Debezium?

Debezium captures changes from PostgreSQL WAL and streams to Kafka. Consumers (materialized views, caches) are updated in real time. Excellent for CDC without polling.

Q78Explain the concept of "synchronous commit" and its performance impact.

Synchronous commit waits for WAL to be flushed to standby before acknowledging the client. Ensures zero data loss but adds latency. Use only for critical transactions; others use asynchronous.

Q79How do you monitor replication lag? Which views to use?

SELECT application_name, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes FROM pg_stat_replication;

Q80What is the Patroni tool and how does it manage HA?

Patroni is a template for high availability using etcd/Consul/ZooKeeper for leader election and configuration. It automates failover and replication management. Often used with pgBouncer and HAProxy.

Q81How do you use the pg_stat_activity to find blocking queries?

SELECT pid, query, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event IS NOT NULL;

Q82Explain the term "transaction ID wraparound" and how to prevent it.

PostgreSQL uses 32-bit transaction IDs. When they approach max, it can cause data loss. VACUUM (especially aggressive) freezes old tuples, preventing wraparound. Monitor age: `SELECT max(age(datfrozenxid)) FROM pg_database;`

Q83How do you implement a "soft delete" with PostgreSQL and still maintain performance?

Add `deleted_at TIMESTAMP` and index it. Use partial index `WHERE deleted_at IS NULL` for active queries. Query with `WHERE deleted_at IS NULL`. Consider partitioning by deleted status.

Q84What are the benefits of using DOMAINs to enforce business rules?

CREATE DOMAIN email_address AS TEXT CHECK (VALUE ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

Q85How do you use pg_hba.conf to control access securely?

# TYPE  DATABASE  USER  ADDRESS          METHOD
host    all       all   192.168.1.0/24   scram-sha-256

🐉 PostgreSQL Developer — Most Expert 10+ Yrs

▼

Q86Design a scalable PostgreSQL cluster for 1M queries per second with global distribution.

Use sharding via Citus or built-in partitioning + FDW. Combine logical replication for cross-region. Use connection pooling (PgBouncer) and read replicas. pgvector for AI search, and PostgresML for in-database ML.

Q87Explain how to build a real-time recommendation system using pgvector and PL/pgSQL.

CREATE EXTENSION vector;
CREATE TABLE products (id SERIAL, embedding vector(512));
SELECT * FROM products ORDER BY embedding <-> '[...]' LIMIT 10;

Q88How do you use PostgresML to train and deploy models directly in the database?

SELECT pgml.train('churn_model', 'classification', 'churn_data', 'is_churn');
SELECT pgml.predict('churn_model', ARRAY[age, income, ...]);

Q89What is the "pg_squeeze" extension and how does it combat bloat automatically?

pg_squeeze reorganizes tables online without blocking, similar to pg_repack but scheduled. It's useful for high-write tables where autovacuum can't keep up.

Q90How do you implement a multi-master solution with PostgreSQL? (BDR, pglogical)

BDR (Bi-Directional Replication) allows writes on multiple nodes. Conflict resolution is application-driven. pglogical provides logical replication with conflict handling. Complex but useful for distributed apps.

Q91Explain how PostgreSQL handles full-text search for multiple languages and ranking.

SELECT ts_rank(to_tsvector('english', body), query) AS rank, * FROM articles ...

Q92What is the "TimescaleDB" extension and how does it make PostgreSQL a time-series database?

TimescaleDB adds automatic partitioning by time (hypertables), compression, and continuous aggregates. Ideal for IoT, monitoring, and financial tick data.

Q93How do you secure PostgreSQL with SSL, LDAP, and row-level security in a HIPAA-compliant system?

Enable SSL, use certificate authentication. Integrate LDAP for user management. Implement RLS policies for patient data isolation. Encrypt columns with pgcrypto. Audit with pgAudit.

Q94Explain the "Zheap" storage engine and its potential to replace the current heap.

Zheap (work in progress) aims to eliminate bloat by avoiding the need for VACUUM. It uses undo logs like Oracle's InnoDB. Currently experimental.

Q95How do you use PL/Java or PL/Python to extend PostgreSQL with AI libraries?

CREATE FUNCTION call_ml(text) RETURNS float AS $$
   import some_ml_lib
   return some_ml_lib.predict(text)
$$ LANGUAGE plpython3u;

Q96What are common pitfalls when migrating from Oracle to PostgreSQL? (sequences, packages)

Oracle's packages become schemas with functions. Sequences need adjustment. PL/SQL to PL/pgSQL: differences in exception handling, autonomous transactions (use dblink), and performance.

Q97Explain the concept of "incremental materialized view maintenance" with pg_ivm.

pg_ivm (extension) allows materialized views to be updated incrementally as base tables change, avoiding full refresh. Great for real-time dashboards.

Q98How do you perform a zero-downtime major version upgrade of PostgreSQL?

Use logical replication: set up new version replica, replicate, switch application traffic, then promote. Or use pg_upgrade with link mode for fast in-place upgrade (requires downtime, but minutes).

Q99What emerging PostgreSQL features excite you for 2026?

Active-active replication improvements, better parallel query, AI integration with pgvector and PostgresML, autonomous tuning, and cloud-native storage engines.

Q100How do you keep your PostgreSQL skills updated?

Follow official blogs, Planet PostgreSQL, attend PGConf, and experiment with extensions and beta versions. Contribute to community tools.

🗄️ PostgreSQL DBA — Beginner

▼

Q101How do you install PostgreSQL on Ubuntu and initialize the cluster?

sudo apt install postgresql postgresql-contrib
sudo pg_ctlcluster 14 main start

Q102What are the key directories in a PostgreSQL data directory?

pg_wal, pg_stat, base (tablespaces), global, pg_logical, and configuration files (postgresql.conf, pg_hba.conf).

Q103How do you create a tablespace and why use it?

CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';

Q104Explain WAL archiving for point-in-time recovery.

Archive mode copies WAL files to a safe location. Enable archive_mode and set archive_command. Combined with base backup, allows PITR to any point.

Q105How do you backup with pg_basebackup?

pg_basebackup -D /backup/base -Ft -z -P

Q106What is the autovacuum daemon and how to tune it?

Autovacuum reclaims storage and updates statistics. Tune with autovacuum_scale_factor and naptime. Monitor in pg_stat_user_tables.

Q107How do you check database size and table sizes?

SELECT pg_database_size('mydb');
SELECT pg_total_relation_size('table_name');

Q108What is a role and how to grant privileges?

CREATE ROLE analyst LOGIN PASSWORD 'secret';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analyst;

Q109Explain the pg_stat_activity view for monitoring sessions.

SELECT pid, usename, state, query FROM pg_stat_activity;

Q110How do you kill a long-running query?

SELECT pg_terminate_backend(pid);

Q111What is the purpose of the statistics collector?

It tracks database activity (table/index scans, rows, etc.) for the query planner. Data stored in pg_stat_* views. Ensures planner uses accurate stats.

Q112How do you enable slow query logging?

log_min_duration_statement = 1000  # log queries >1s

Q113What is the checkpoint process and how does it affect performance?

Checkpoint writes dirty pages to disk. Frequent checkpoints reduce recovery time but increase I/O. Tune with checkpoint_timeout and max_wal_size.

Q114How do you use pg_dump vs pg_dumpall?

pg_dump for single database; pg_dumpall for entire cluster including roles and tablespaces.

Q115What is the pg_stat_statements extension and how to set it up?

CREATE EXTENSION pg_stat_statements;

Q116How do you create a read-only user?

CREATE USER readonly WITH PASSWORD 'pass';
GRANT CONNECT ON DATABASE mydb TO readonly;
GRANT USAGE ON SCHEMA public TO readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;

Q117What is the pg_hba.conf file and how does it control authentication?

Host-Based Authentication file defines which users can connect from where using which method (password, md5, scram-sha-256, peer).

Q118How do you restart PostgreSQL without downtime? (pg_ctl reload)

`pg_ctl reload` sends SIGHUP to reload configuration files without restart. Some changes require a full restart.

Q119Explain the difference between physical and logical replication.

Physical: ships WAL, byte-for-byte copy. Logical: decodes WAL into row changes, allows selective replication, can cross versions.

Q120How do you monitor disk space usage?

SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname)) FROM pg_database;

🗄️ PostgreSQL DBA — Intermediate

▼

Q121How do you set up streaming replication with a hot standby?

# On primary
CREATE USER replicator REPLICATION LOGIN CONNECTION LIMIT 5 ENCRYPTED PASSWORD 'secret';
# On standby: pg_basebackup, configure recovery.conf / standby.signal, start PostgreSQL.

Q122What is the difference between synchronous and asynchronous streaming replication?

Synchronous waits for standby to acknowledge WAL flush before committing; ensures zero data loss but adds latency. Asynchronous is faster but may lose last transactions on failover.

Q123How do you promote a standby to primary?

pg_ctl promote -D /var/lib/postgresql/data

Q124Explain the use of pg_rewind to re-sync a former primary.

If a standby was promoted and you want to bring back the old primary as a standby, pg_rewind synchronizes the data directories by copying changed blocks, avoiding a full resync.

Q125What is the `max_wal_senders` parameter and how to size it?

Limits concurrent WAL sender processes for replication. Set to number of standbys + some buffer for backups. Default 10, increase for multiple replicas.

Q126How do you use replication slots to prevent WAL removal?

SELECT * FROM pg_create_physical_replication_slot('standby_slot');

Slots ensure WAL is retained until consumed by standbys, but can cause disk full if standbys lag.

Q127How do you configure PgBouncer for transaction pooling?

[databases]
mydb = host=127.0.0.1 port=5432
[pgbouncer]
pool_mode = transaction
listen_port = 6432

Q128What are the best practices for VACUUM on a high-write OLTP system?

Tune autovacuum to be more aggressive (reduce scale factor). Monitor dead tuples. Schedule VACUUM FREEZE periodically. Use pg_repack for severe bloat.

Q129How do you perform a major version upgrade using pg_upgrade?

/usr/lib/postgresql/15/bin/pg_upgrade \
  --old-datadir /var/lib/postgresql/14/data \
  --new-datadir /var/lib/postgresql/15/data \
  --old-bindir /usr/lib/postgresql/14/bin \
  --new-bindir /usr/lib/postgresql/15/bin

Q130Explain the use of `pg_stat_user_tables` to find bloat.

SELECT schemaname, relname, n_dead_tup, n_live_tup, last_vacuum FROM pg_stat_user_tables;

🗄️ PostgreSQL DBA — Expert

▼

Q146How do you design a disaster recovery plan with cross-region replication?

Use streaming replication to a different region with asynchronous mode. Combine with WAL archiving to S3 for point-in-time recovery. Test failover regularly.

Q147What is the "full_page_writes" parameter and how does it affect performance?

Ensures entire page is written to WAL after checkpoint; prevents partial page corruption but increases WAL volume. Keep enabled unless using battery-backed cache or ZFS.

Q148Explain the differences between PostgreSQL on Kubernetes and traditional VMs.

Kubernetes offers dynamic scaling and self-healing, but requires careful storage and network setup. Use Crunchy Data or Zalando operators. Performance overhead is minimal with proper configuration.

Q149How do you use the pgBackRest tool for advanced backup and restore?

pgbackrest --stanza=main --type=full backup
pgbackrest --stanza=main --type=diff backup

Q150What are the signs of transaction ID wraparound risk and how to prevent it?

Check `age(datfrozenxid)` > 1 billion. Run aggressive VACUUM FREEZE. Ensure autovacuum is working. Set `vacuum_freeze_min_age` appropriately.

☁️ Cloud & Advanced

▼

Q171What are the benefits of Amazon Aurora PostgreSQL vs standard RDS PostgreSQL?

Aurora offers faster storage, up to 15 read replicas, automatic scaling, and global database. RDS is standard PostgreSQL with managed ops. Aurora is more expensive but higher performance.

Q172How do you run PostgreSQL on Kubernetes with the Zalando operator?

Deploy the operator, create a PostgreSQL CRD. It manages HA, backups, and cloning. Uses Patroni under the hood.

Q173Explain the use of `pg_tle` (Trusted Language Extensions) on AWS RDS.

Allows safe deployment of PL extensions without full superuser. Enables custom functions in restricted environments.

Q174What is Supabase and how does it wrap PostgreSQL?

Supabase provides a real-time API, authentication, and storage on top of PostgreSQL. It uses PostgREST and row-level security heavily. Great for rapid app development.

Q175How do you set up a hybrid cloud PostgreSQL with on-prem and cloud read replicas?

Use logical replication from on-prem to cloud. Cloud replica can serve analytics. Ensure secure connection (VPN/SSL).

🚚 Migration & Upgradation

▼

Q191How do you migrate from Oracle to PostgreSQL? Key challenges.

Use Ora2Pg for schema conversion. Handle sequences, PL/SQL to PL/pgSQL, data types (NUMBER to NUMERIC). Test extensively. Use foreign data wrappers for incremental migration.

Q192How do you perform a zero-downtime migration from MySQL to PostgreSQL?

Use pgloader for initial load. Set up logical replication from MySQL (via Debezium) to PostgreSQL, then switch over after validation.

Q193What is the pgloader tool and how to use it?

pgloader mysql://user@localhost/sourcedb postgresql://user@localhost/targetdb

🤖 AI/ML with PostgreSQL

▼

Q211How do you use pgvector for semantic search? Example with embeddings.

SELECT * FROM documents ORDER BY embedding <=> query_embedding LIMIT 5;

Q212Explain how to build a RAG system with PostgreSQL, pgvector, and an external LLM API.

Store document chunks with embeddings. On query, retrieve relevant chunks via vector similarity, construct a prompt with context, call LLM API via PL/Python or external service, return answer.

Q213What is PostgresML and how does it enable in-database ML training and inference?

It integrates popular ML libraries (XGBoost, Scikit-learn) into SQL functions. `SELECT pgml.train(...)`. No data movement needed.

Q214How can you call an external AI API from a PostgreSQL function using PL/Python?

CREATE FUNCTION get_embedding(text) RETURNS vector AS $$
   import openai
   return openai.Embedding.create(input=text, model='text-embedding-3-small')['data'][0]['embedding']
$$ LANGUAGE plpython3u;

Q215What are the use cases for `pg_similarity` extension?

String similarity, fuzzy matching, and deduplication using Jaccard, Cosine, etc. Useful for data quality tasks.

🎭 Scenarios

▼

S1"Slow query after PostgreSQL upgrade" — how to diagnose.

Check execution plan with EXPLAIN ANALYZE. Compare plan before/after. Likely caused by statistics or planner changes. Use `pg_stat_statements` to find regression. Fix with ANALYZE or hints.

S2"Unexpected autovacuum causing IO spikes" — tuning steps.

Set autovacuum_vacuum_cost_limit higher, adjust cost delay, and maybe throttle during business hours using cron to disable/enable.

S3"Replication lag on standby" — troubleshooting.

Check network, WAL generation rate, apply latency. Increase max_wal_senders, use replication slots, or upgrade standby hardware.

S4"Disk full due to WAL" — emergency resolution.

Archive or delete old WAL (if no replication needs), temporarily increase disk, or set up monitoring. `pg_archivecleanup` can help.

S5"Row-level security misconfiguration exposed data."

Audit policies with `pg_policies`. Ensure default deny, test with `SET ROLE`. Implement comprehensive test suite.

S6"Connections exhausted during Black Friday."

Increase max_connections cautiously. Implement PgBouncer. Reduce idle timeouts. Scale out with read replicas.

S7"Migration from SQL Server — date/time issues."

Convert DATETIME2 to TIMESTAMP, handle time zone. Use pgloader transformation rules.

S8"Slow full-text search on large corpus."

Use GIN index, tune `gin_fuzzy_search_limit`, or consider external search engines like Elasticsearch if beyond PostgreSQL's scope.

S9"Corrupted data page — how to recover?"

Use `pg_resetwal` as last resort; restore from backup. Enable data checksums to detect early.

S10"Planner chooses sequential scan when index should be used."

Check random_page_cost (lower for SSD). Check statistics; maybe increase statistics target. Use `SET enable_seqscan = off` temporarily to test.

🧪 Hands-On Labs

▼

L1Install PostgreSQL 16, create a database, and set up streaming replication.

Follow step-by-step in official docs or use Docker: `docker run --name pg1 -e POSTGRES_PASSWORD=pass -d postgres:16` and configure replication.

L2Create a PL/pgSQL function that calculates compound interest with input validation.

CREATE FUNCTION compound_interest(principal NUMERIC, rate NUMERIC, years INT) RETURNS NUMERIC ...

L3Set up pgvector and perform similarity search on a set of product descriptions.

Generate embeddings using a Python script, load into table, query with `<=>`.

L4Implement row-level security for a multi-tenant SaaS schema.

ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects USING (tenant_id = current_setting('app.current_tenant')::int);

L5Use pgloader to migrate a MySQL database to PostgreSQL.

pgloader mysql://root@localhost/mydb postgresql://user@localhost/mydb

💻 Code Exercises

▼

E1Write a query to find the top 3 products by revenue each month using window functions.

SELECT month, product, revenue FROM (
   SELECT date_trunc('month', sale_date) as month, product, SUM(amount) revenue,
   RANK() OVER (PARTITION BY date_trunc('month', sale_date) ORDER BY SUM(amount) DESC) rnk
   FROM sales GROUP BY 1,2
) sub WHERE rnk <= 3;

E2Create a trigger that prevents deletion of employees with active projects.

CREATE FUNCTION check_projects() RETURNS trigger AS $$
BEGIN
   IF EXISTS (SELECT 1 FROM projects WHERE manager_id = OLD.id AND status = 'active') THEN
       RAISE EXCEPTION 'Cannot delete employee with active projects';
   END IF;
   RETURN OLD;
END; $$ LANGUAGE plpgsql;

⚡ Performance Tuning Deep Dive

▼

Q226How do you read an EXPLAIN ANALYZE output? Key metrics.

Look at actual time vs planned, rows vs estimated, loops. High startup or total time indicates bottleneck. Buffers show I/O. Use `FORMAT JSON` for tools.

Q227What is `work_mem` and how does it affect sorts and hashes?

It's memory per operation. Too low causes disk spills (slow). Too high can cause OOM. Increase for large analytical queries, but set per session if needed.

Q228Explain the use of `pg_stat_statements` to find the top slow queries.

SELECT query, calls, mean_time, total_time FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;

Q229How do you use `auto_explain` to log execution plans automatically?

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';
SET auto_explain.log_analyze = on;

Q230What is `pg_buffercache` and how does it help tuning?

SELECT c.relname, count(*) AS buffers FROM pg_buffercache b JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10;

Q231Explain the role of `effective_cache_size` in the planner.

It estimates how much memory is available for caching (OS + shared buffers). Planner uses it to decide between index scans and sequential scans. Set to ~75% of total RAM.

Q232What is `jit_above_cost` and when to adjust it?

If query cost exceeds this, JIT compilation is considered. Increase if JIT overhead is too high for simple queries, decrease to benefit complex ones.

Q233How do you use `pg_prewarm` to improve cache after restart?

SELECT pg_prewarm('important_table');

Q234Explain the impact of `random_page_cost` on SSD vs HDD.

Default 4.0 assumes HDD. For SSD, set to 1.0-1.5 to reflect that random access is nearly as fast as sequential. This encourages index use.

Q235What is the `max_parallel_workers_per_gather` parameter?

Limits parallel workers per gather node. Increase for large tables. Set based on CPU cores. Too high can oversubscribe CPU.

Q236How do you analyze locking with `pg_locks`?

SELECT l.pid, l.mode, l.granted, a.query FROM pg_locks l JOIN pg_stat_activity a ON l.pid = a.pid;

Q237What are common causes of "out of shared memory" errors?

Too high `shared_buffers`, large number of connections, or huge `work_mem` settings. Reduce or adjust kernel parameters.

Q238How do you set up monitoring with Prometheus and postgres_exporter?

./postgres_exporter --web.listen-address=":9187"

Scrape metrics into Prometheus, visualize with Grafana.

Q239Explain the use of `pg_stat_reset` and when it's appropriate.

Resets statistics counters. Use after major changes to see fresh baselines. Not for routine monitoring.

Q240What is the `temp_file_limit` parameter?

Limits temporary file usage per session (for sorts/hashes). Prevents runaway queries from filling disk.

Q241How do you use `checkpoint_completion_target` to smooth I/O?

Set to 0.7-0.9 to spread checkpoint writes over time, reducing spikes. Higher value means longer checkpoints but smoother.

Q242What is `wal_compression` and when to enable it?

Compresses WAL records, reducing I/O and network traffic for replication. CPU overhead moderate. Enable if WAL writes are bottleneck.

Q243How to troubleshoot a CPU-bound PostgreSQL instance?

Check `top`, `pg_stat_activity`. Look for high CPU queries with `pg_stat_statements`. Tune queries, add indexes, or scale up.

Q244What is the impact of `synchronous_commit = off` on performance?

It reduces commit latency because it doesn't wait for WAL flush. Risk of losing a few committed transactions on crash. Acceptable for many analytics workloads.

Q245Explain the `pgstattuple` extension to analyze table bloat.

SELECT * FROM pgstattuple('table_name');

Q246How do you use connection pooling to handle thousands of connections?

Deploy PgBouncer in transaction pooling mode. Adjust default_pool_size. This multiplexes client connections into a smaller number of DB connections.

Q247What is the difference between `pg_stat_database` and `pg_stat_bgwriter`?

Database-level stats (commits, rollbacks, blocks) vs background writer efficiency (buffers cleaned, checkpoints).

Q248How do you implement connection failover with HAProxy and Patroni?

HAProxy checks Patroni's health endpoint to route traffic to the primary. Patroni manages the PostgreSQL cluster. This provides automatic failover for applications.

Q249What are the best practices for configuring `max_connections`?

Set to a reasonable number (e.g., 200-500) and use connection pooler. Too many connections cause context switching and memory pressure. Monitor `pg_stat_activity` for idle connections.

Q250Final advice: How to become a PostgreSQL performance expert?

Master EXPLAIN, understand the planner, learn internal architecture (WAL, MVCC, buffer management), and practice with real-world workloads. Contribute to the community and stay curious.

Mominul's Blog

Latest

Home Top Ad

Thursday, July 2, 2026

🐘 The Ultimate PostgreSQL Interview Mastery Guide

🐣 PostgreSQL Developer — Beginner 0-2 Yrs

🐥 PostgreSQL Developer — Intermediate 2-5 Yrs

🦅 PostgreSQL Developer — Expert 5-10 Yrs

🐉 PostgreSQL Developer — Most Expert 10+ Yrs

🗄️ PostgreSQL DBA — Beginner

🗄️ PostgreSQL DBA — Intermediate

🗄️ PostgreSQL DBA — Expert

☁️ Cloud & Advanced

🚚 Migration & Upgradation

🤖 AI/ML with PostgreSQL

🎭 Scenarios

🧪 Hands-On Labs

💻 Code Exercises

⚡ Performance Tuning Deep Dive

No comments:

Post a Comment

Author Details

Subscribe

Translate

Pageviews last month

Recent

Popular

Comments

Notice

⚠️ Copyright & Content Notice

📧 Contact Us

Archive

Sponsor

Learning

Tags

Search This Blog

Contact Form