Saturday, August 28, 2021

Diffractor Photo - A new photo and video manager

If you have many thousands of photos on your PC, it is useful to bring some organization to them, for example by tagging them with keywords and ratings. You can choose from many products for this, ranging from free little programs like XnView or IrfanView to commercial heavyweights like Adobe Bridge. We found a new name in this already generous lineup, however: Diffractor Photo. With it you organize and view collections of photos and videos, and you can also apply simple edits to them. The only drawback is that the free version only processes standard folders automatically.

You can find Diffractor Photo here.


Thursday, August 19, 2021

Franck Pachot: pgbench --client --jobs


In this post I'll explain, with examples, the pgbench options --client and --jobs, used when running concurrent activity. Their names are misleading: --client is really about the number of server processes and --jobs about the number of client threads 🤨

I'm using the short options in the examples below; here are the equivalents:

    pgbench --help | grep -E -- " -[stTcjfn],"

    -n, --no-vacuum          do not run VACUUM during initialization
    -s, --scale=NUM          scaling factor
    -f, --file=FILENAME[@W]  add script FILENAME weighted at W (default: 1)
    -c, --client=NUM         number of concurrent database clients (default: 1)
    -j, --jobs=NUM           number of threads (default: 1)
    -n, --no-vacuum          do not run VACUUM before tests
    -s, --scale=NUM          report this scale factor in output
    -t, --transactions=NUM   number of transactions each client runs (default: 10)
    -T, --time=NUM           duration of benchmark test in seconds

In order to run something simple and predictable, I'll use a custom script which simply waits one second in the database: select pg_sleep(1). And because I like one-liners, I pass it through STDIN:

    pgbench -t 10 -nf /dev/stdin <<< "select pg_sleep(1)"

    transaction type: /dev/stdin
    scaling factor: 1
    query mode: simple
    number of clients: 1
    number of threads: 1
    number of transactions per client: 10
    number of transactions actually processed: 10/10
    latency average = 1005.139 ms
    tps = 0.994887 (including connections establishing)
    tps = 0.995271 (excluding connections establishing)

I've set it to run 10 transactions and, without surprises, it ran at 1 transaction per second, given that I have 1 thread running 10 transactions through 1 client connection. Those are the defaults: -j 1 -c 1. I'll now run with different values.

-c --client number of concurrent database clients (default: 1)

This is the most important option for controlling the load on the database. Each client is a connection to the DB, which means a backend process, and transactions are executed concurrently across clients. Let's run the same as above, now with 2 clients:

    pgbench -j 1 -c 2 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"

    transaction type: /dev/stdin
    scaling factor: 1
    query mode: simple
    number of clients: 2
    number of threads: 1
    duration: 30 s
    number of transactions actually processed: 60
    latency average = 1004.676 ms
    tps = 1.990692 (including connections establishing)
    tps = 1.990951 (excluding connections establishing)

I achieve 2 transactions per second now, still with the 1-second-latency query. Increasing the number of clients increases the throughput linearly, as long as there are no bottlenecks elsewhere. This is where we say "it scales".
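
If you want to see that each client really is a dedicated backend process, you can query pg_stat_activity from another session while a run is active. A minimal check (assuming pgbench connects with its default application_name of 'pgbench'; adjust the connection options to your setup):

    psql -c "select pid, state, query from pg_stat_activity where application_name = 'pgbench';"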

Because a pg_sleep(1) doesn't take a lot of resources, I still have about the same latency with 100 connections:

    pgbench -j 1 -c 100 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"

    transaction type: /dev/stdin
    scaling factor: 1
    query mode: simple
    number of clients: 100
    number of threads: 1
    duration: 30 s
    number of transactions actually processed: 2900
    latency average = 1056.748 ms
    tps = 94.629933 (including connections establishing)
    tps = 94.652406 (excluding connections establishing)
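
You can also verify from the OS that this load is driven by a single pgbench thread. On Linux, something like the following (a hypothetical check; nlwp is the thread count of the process) should report 1 while the run above is active:

    # Show the pgbench process and its number of threads (nlwp):
    ps -o pid,nlwp,cmd -p $(pgrep -x pgbench)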

Asynchronous libpq

But you can see that I still have the default of 1 thread (--jobs=1) here. How can transactions run concurrently through 100 connections (i.e. 100 server backends, --client) driven from 1 client thread (--jobs)?

Here is a trace of the interesting system calls in the communication with the database, with 3 clients driven from 1 thread:

    strace -T -s 1000 -e trace=sendto,recvfrom,pselect6 -yy -o /dev/stdout pgbench -j 1 -c 3 -t 1 -nf /dev/stdin <<< "select pg_sleep(1)" | grep 5432

    sendto(3<TCPv6:[[::1]:41360->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000116>
    recvfrom(3<TCPv6:[[::1]:41360->[::1]:5432]>, 0xaaac8fb02dd0, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
    sendto(4<TCPv6:[[::1]:41362->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000037>
    recvfrom(4<TCPv6:[[::1]:41362->[::1]:5432]>, 0xaaac8fb11750, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
    sendto(5<TCPv6:[[::1]:41364->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000026>
    recvfrom(5<TCPv6:[[::1]:41364->[::1]:5432]>, 0xaaac8fb1c180, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000005>
    pselect6(6, [3<TCPv6:[[::1]:41360->[::1]:5432]> 4<TCPv6:[[::1]:41362->[::1]:5432]> 5<TCPv6:[[::1]:41364->[::1]:5432]>], NULL, NULL, NULL, NULL) = 2 (in [4 5]) <1.001285>
    recvfrom(4<TCPv6:[[::1]:41362->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000009>
    sendto(4<TCPv6:[[::1]:41362->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000327>
    recvfrom(5<TCPv6:[[::1]:41364->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000009>
    sendto(5<TCPv6:[[::1]:41364->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000029>
    pselect6(4, [3<TCPv6:[[::1]:41360->[::1]:5432]>], NULL, NULL, NULL, NULL) = 1 (in [3]) <0.000009>
    recvfrom(3<TCPv6:[[::1]:41360->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000014>
    sendto(3<TCPv6:[[::1]:41360->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000035>

I can clearly see 3 calls to sendto(...->...5432..., "Q...select pg_sleep(1)") returning immediately. Then pselect6(...:5432...:5432...:5432) <1.001285> waits for the first response from any of them, which takes 1 second. And then the results are received with recvfrom(...) from each connection.

Those are asynchronous calls, and I know many developers who have been waiting for that for years in other databases.

If I add -k to strace I can get the call stack:

    sendto(5<TCPv6:[[::1]:42120->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000031>
     > /usr/lib64/libpthread-2.28.so(__send+0x34) [0x11a2c]
     > /usr/lib64/libpq.so.5.13(pqsecure_raw_write+0x6f) [0x1f52f]
     > /usr/lib64/libpq.so.5.13(pqSendSome+0x77) [0x19547]
     > /usr/lib64/libpq.so.5.13(PQsendQuery+0x7b) [0x14f3b]
     > /usr/bin/pgbench(threadRun+0x12e7) [0x84b7]
     > /usr/bin/pgbench(main+0x16a7) [0x4867]
     > /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > No DWARF information found
    recvfrom(5<TCPv6:[[::1]:42120->[::1]:5432]>, 0xaaadd068c180, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000004>
     > /usr/lib64/libpthread-2.28.so(recv+0x34) [0x11864]
     > /usr/lib64/libpq.so.5.13(pqsecure_raw_read+0x3b) [0x1f233]
     > /usr/lib64/libpq.so.5.13(pqReadData+0xab) [0x192fb]
     > /usr/lib64/libpq.so.5.13(PQconsumeInput+0x23) [0x1536b]
     > /usr/bin/pgbench(threadRun+0x853) [0x7a23]
     > /usr/bin/pgbench(main+0x16a7) [0x4867]
     > /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > No DWARF information found
    pselect6(6, [3<TCPv6:[[::1]:42116->[::1]:5432]> 4<TCPv6:[[::1]:42118->[::1]:5432]> 5<TCPv6:[[::1]:42120->[::1]:5432]>], NULL, NULL, NULL, NULL) = 1 (in [3]) <1.000339>
     > /usr/lib64/libc-2.28.so(__select+0x74) [0xcaa5c]
     > /usr/bin/pgbench(threadRun+0x14c3) [0x8693]
     > /usr/bin/pgbench(main+0x16a7) [0x4867]
     > /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > No DWARF information found
    recvfrom(3<TCPv6:[[::1]:42116->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000042>
     > /usr/lib64/libpthread-2.28.so(recv+0x34) [0x11864]
     > /usr/lib64/libpq.so.5.13(pqsecure_raw_read+0x3b) [0x1f233]
     > /usr/lib64/libpq.so.5.13(pqReadData+0xab) [0x192fb]
     > /usr/lib64/libpq.so.5.13(PQconsumeInput+0x23) [0x1536b]
     > /usr/bin/pgbench(threadRun+0x853) [0x7a23]
     > /usr/bin/pgbench(main+0x16a7) [0x4867]
     > /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > /usr/bin/pgbench(_start+0x33) [0x5653]
     > No DWARF information found

The libpq C library used by pgbench has an asynchronous API: PQsendQuery and pqReadData are visible in the call stacks above.

-j --jobs number of threads (default: 1)

Then why run multiple threads on the client? You probably don't need to, as one thread can handle hundreds of asynchronous calls.

First, the threads cannot share the connections, so you cannot have more client threads than server connections:

    pgbench -j 2 -c 1 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"

    transaction type: /dev/stdin
    scaling factor: 1
    query mode: simple
    number of clients: 1
    number of threads: 1
    duration: 30 s
    number of transactions actually processed: 30
    latency average = 1005.422 ms
    tps = 0.994607 (including connections establishing)
    tps = 0.994731 (excluding connections establishing)

This has simply ignored --jobs and set it to the same value as --client (you can see that in "number of threads: 1"). The connections defined by --client are distributed among the threads defined by --jobs, and it makes no sense to have threads with no connections. However, you can have many connections per thread, as we have seen above. This still stresses the database with concurrent executions, thanks to asynchronous calls.
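
For instance (a hypothetical run, output not shown), asking for 8 clients over 4 threads gives each thread 2 connections to multiplex:

    pgbench -j 4 -c 8 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"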

So what's the point of --jobs? My example ran a script that takes a long time in the database (1 second) compared to the client-side work, which is why one client thread (--jobs=1) can serve many connections (--client=100) without being the bottleneck. However, if you run really short queries over many connections, the work on the client side can become significant. And as the goal of pgbench is to stress the database, you may need more threads. Don't forget that if you want to stress the database CPU, you will probably not run pgbench on the database server itself. But then you need more connections, because there's a network component in the latency.
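
As a sketch of how you might check whether the client thread is the bottleneck (hypothetical commands; the numbers depend entirely on your machines and network), run the same short query with one and with several threads and compare the tps:

    pgbench -c 100 -j 1 -T 30 -nf /dev/stdin <<< "select 1"
    pgbench -c 100 -j 4 -T 30 -nf /dev/stdin <<< "select 1"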

I'm taking an extreme example here where my custom script doesn't even call the database but takes 1 second of client time:

    pgbench -j 1 -c 100 -T 30 -nf /dev/stdin <<< "\shell sleep 1"

    transaction type: /dev/stdin
    scaling factor: 1
    query mode: simple
    number of clients: 100
    number of threads: 1
    duration: 30 s
    number of transactions actually processed: 29
    latency average = 446722.970 ms
    tps = 0.223852 (including connections establishing)
    tps = 0.223857 (excluding connections establishing)

My single thread throttles the throughput: during 30 seconds, only about 30 transactions are possible on one thread when the client-side processing takes 1 second per transaction.
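
If the client-side work is the bottleneck, giving each client its own thread should restore the concurrency. A sketch of the fix, which I haven't benchmarked here (with -j 100 the one-second sleeps can run in parallel, so you would expect on the order of 100 times more transactions):

    pgbench -j 100 -c 100 -T 30 -nf /dev/stdin <<< "\shell sleep 1"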

Key points:

  • -c --client is what drives the number of sessions on the server
  • -j --jobs can be used if the coordination from pgbench is a bottleneck

I'll share more about pgbench, because benchmarks mean nothing if we don't understand exactly what is run and how. And pgbench, using libpq with custom scripts, is great for showing the different ways to run SQL efficiently. So I'm flagging this as the first post of a series.


Friday, August 13, 2021

Postgres DBs in WSL2 not Showing in PGAdmin

Via PostgreSQL by /u/CommonMeaning

I just bought a new PC and I'm setting up my Postgres DB for my Rails apps (running Rails on localhost) to try to replicate what I had on my old PC.

I installed WSL2, Postgres, and PGAdmin (on Windows).

I know that Postgres is running because I can query the database in WSL2 through the Rails console (and can create and retrieve data). Also, when I am using WSL2 through Ubuntu on my Windows machine, if I try to create the DB in Rails, it says my development and test DBs already exist.

However, when I try to connect to the DB, neither PGAdmin nor DBeaver can see it.

Usually, when a DB is created in Rails for me, it automatically shows up when I refresh. With these, it doesn't show, and if I try to add a DB with the same name I created for my Rails app, it lets me (and there are no tables, so I know it's not the WSL2 DB Rails is using).

https://preview.redd.it/7c8agqfwsyg71.png?width=1227&format=png&auto=webp&s=b20ed33fe8edd7b8c84fed06829f4d47eb413de5

https://preview.redd.it/jiobo93ysyg71.png?width=1221&format=png&auto=webp&s=f150e1219e1a49ef6d89ab6f4a0379fd1193079d

I have also tried to use 127.0.0.1 as the host for the server, but it doesn't work.

I'm completely stumped. Any guidance would be appreciated. Thank you!



Wednesday, August 11, 2021

How the network can affect query performance

#418 — August 11, 2021


📋 I'm taking a rare vacation next week so Postgres Weekly will be taking a break too :-) I hope your summer (or winter, if you're in the southern hemisphere!) is going well and we'll be back on August 25.
Peter Cooper, Editor

Postgres Weekly


Grouping Sets with ROLLUP and CUBE — A grouping set is a set of columns used in a GROUP BY clause and can come in handy in OLAP queries. This post has some handy examples.

Hans-Jürgen Schönig
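
For a flavor of the ROLLUP syntax (an illustrative query on a made-up sales table, not taken from the linked post):

    psql <<'SQL'
    -- Subtotals per (region, product), per region, and a grand total:
    select region, product, sum(amount)
    from sales
    group by rollup (region, product);
    SQL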

The Impact of the Network and Cursors on Query Performance — Network related overhead isn't often considered when we think about database or query performance, but it pays to be aware of it. Cursors are also taken into account as they can reduce initial latency on big result sets.

Jobin Augustine


CYBERTEC TDE: Transparent Data Encryption for PostgreSQL — TDE is a custom patch to PostgreSQL. It is currently the only implementation that fully supports transparent and cryptographically safe data (cluster) level encryption, independent of operating system or file system encryption. Download TDE here.

CYBERTEC sponsor

Modernizing a Familiar Approach to REST APIs, with Postgres and Cloudflare Workers — PostgREST is a tool that generates a REST API for Postgres databases, and here we learn how Cloudflare Tunnel can be used to let Worker instances talk to a PostgREST instance hosted on a private network.

Kristian Freeman (Cloudflare)


A Simple Example of a LATERAL Join Use Case — A month ago we linked to a piece about understanding LATERAL joins but here's a practical use case.

Luca Ferrari
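
As a reminder of the shape of a LATERAL join (an illustrative schema, not the article's own example):

    psql <<'SQL'
    -- For each customer, fetch their three most recent orders:
    select c.name, o.ordered_at, o.total
    from customers c,
         lateral (select ordered_at, total
                  from orders
                  where orders.customer_id = c.id
                  order by ordered_at desc
                  limit 3) o;
    SQL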

Citus 10 Now GA in Hyperscale (Citus) for Postgres on Azure — One solely for the Azure users among you, but Azure's own Hyperscale Postgres service now supports the latest Citus release.

Nik Larin (Microsoft)

▶  How Citus Distributes PostgreSQL — If you keep seeing references to Citus (as above) but aren't sure just how it helps distribute Postgres, Marco Slot did a handy talk to bring you up to speed. The slides are really fantastic with lots of diagrams and examples of Citus's operation.

Marco Slot

postgres_fdw Enhancements in Postgres 14 — A preview of some of the changes expected in the foreign data wrapper (FDW) for working with data stored in external Postgres servers.

Ibrar Ahmed

📈 How to Analyze Intraday Stock Data with Postgres, Python & TimescaleDB

Timescale sponsor

Working with Postgres Types — The author has been experimenting with sharing types between Postgres functions.

Jon Udell

10 Common Postgres Errors — Some quick fire common errors and warnings to watch out for, with symptoms and solutions, around things like memory, disk space, and permissions. (This is an older item but recently popped up on our radar again.)

Ibrar Ahmed

PureORM: A Node.js SQL Toolkit for Writing Native SQL Queries Yielding Pure Business Objects — Allows you to write regular native SQL and receive back properly structured (nested) pure business objects, as opposed to a more typical ORM where you build queries in other ways.

Craig Martin

📅 PGDAY Austria is a Postgres conference taking place in Vienna on September 17, 2021, if you're in that part of the world.



Monday, August 9, 2021

Any good way to track DB changes on a Postgres DB?

Via PostgreSQL by /u/ddmmatias

Hey everyone. I have to create a full set of Laravel seeders for building environments. I found a project that can generate such seeders from an existing DB, but they are generated table by table.

The thing is that I have to create the data from the app, and the entities that are generated create data in many tables. I'd like to track all changes that are made in the DB after creating a new account, new user, new post, etc.

Is that possible? Maybe something like: take a snapshot of the DB, make your changes, then diff your current DB against your snapshot?
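
One low-tech way to realize that snapshot-and-diff idea (a sketch; mydb is a placeholder database name) would be to dump the data before and after the app action and compare:

    pg_dump --data-only --inserts mydb > before.sql
    # ...create the account/user/post from the app...
    pg_dump --data-only --inserts mydb > after.sql
    diff before.sql after.sql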



Saturday, August 7, 2021

pgmoneta 0.4.0

The pgmoneta community is happy to announce version 0.4.0.

New features

  • Enhancements to restore and archive functionality

pgmoneta

pgmoneta is a backup / restore solution for PostgreSQL.

Read our getting started guide to set up pgmoneta for your backup needs.

Features

  • Full backup
  • Restore
  • Symlink support
  • Prometheus support
  • Remote management
  • Transport Layer Security (TLS) v1.2+ support
  • Daemon mode
  • User vault

Learn more on our web site or GitHub. Follow on Twitter.

pgmoneta is released under the 3-clause BSD license, and is sponsored by Red Hat.


Cary Huang: Understanding the Security Around PostgreSQL

1. What Is Security?

The word "Security" is a very broad concept and could refer to completely different procedures and methodology to achieve. Knowing what security means to your application is very important, so you could execute proper security practices and procedures to ensure the safety of your company's assets. Data compromises could often lead to financial loss, reputation damage, consumer confidence disintegration, brand erosion, and non-compliance of government and industry regulation.

For this reason, security in infrastructure software such as PostgreSQL is even more important, because any data compromise could have nation- or city-wide impact that is often difficult to recover from completely.

2. Common Database Compromises

User Compromises:

  • Excessive privileges
  • Privilege abuse
  • Weak user authentication
  • Weak password
  • Default privilege too open

Data Compromises:

  • Unmanaged and unprotected sensitive data
  • Backup data exposure
  • Stolen hard disks
  • Unmanaged encryption keys

Network Compromises:

  • Firewall rules
  • Deep Packet Inspection (DPI)
  • Vulnerability prevention
  • Denial of Service (DoS) attack

Vulnerability:

  • Software bug
  • Buffer overflow
  • SQL injection
  • Privilege escalation

3. The Big Picture

There are roughly 5 types of "security" concepts around a PostgreSQL server:

3.1 Data Security (Over Network)

This is the security of the communication between the PostgreSQL client and server; in a production environment we almost always want to use TLS to encrypt the data communication. TLS guarantees mutual trust between the client and the server, so each side is sure it is communicating with the right entity instead of a rogue one. SSH tunneling is also a common option to secure a psql connection when TLS is not fully set up, and it is also very secure, as each connection forces the client and server to generate and agree on an encryption key that is valid only for that session. Furthermore, SSH tunneling can be made more secure by setting up a public/private key pair between client and server to ensure the authenticity of the two entities.
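
For example, a classic way to tunnel psql over SSH looks like this (hostnames and database names are placeholders):

    # Forward local port 5433 to the PostgreSQL port on the database host:
    ssh -N -L 5433:localhost:5432 user@db.example.com &
    # Connect through the tunnel:
    psql "host=localhost port=5433 dbname=mydb"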

3.2 Data Security

This is the security between PostgreSQL and the disk to which it writes data. This type of security is often referred to as "Cluster Data Encryption" or "Transparent Data Encryption". The current version of PostgreSQL does not support this feature, but a handful of talented people are working on it right now. This security is designed to prevent data compromises done directly on the hard disk: by encrypting the data on the disk, a hard disk thief will not be able to extract useful information from it.

3.3 Network Security

This is the security that most likely involves a firewall between a connecting client and the server. The purpose of a firewall is to block most of the malicious connections coming from the public network and prevent unauthorized access to the server. More advanced firewalls, such as an IPS (Intrusion Prevention System), can block DoS attacks and perform deep packet inspection against a database of known malicious packets and attacks. There are also firewalls, such as an IDS (Intrusion Detection System), that perform network monitoring only and raise an alert to the operator should they detect an attack attempt.

3.4 Vulnerability

This is the security concern mostly caused by software bugs that allow an attacker to take advantage of the server, steal data, or simply put a stop to the server and cause damage. The best way to prevent this is to upgrade your PostgreSQL server to the latest version, which has addressed most of the known vulnerabilities.

3.5 User Security

This is the security that relates mostly to user management, sometimes called Role-Based Access Control (RBAC). This is where a database administrator manages each database user and sets the right privileges for the right users. Excessive privileges, weak passwords and privilege abuse are very common if this is not done correctly. Making sure the right users get the right privileges, and using a third-party authentication service such as LDAP or Kerberos instead of simple passwords, can significantly increase the security rating of your database infrastructure.
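
As an illustration of the least-privilege idea (role and schema names are made up for this sketch):

    psql <<'SQL'
    -- A read-only role that can only SELECT from one schema:
    create role app_reader login;
    grant usage on schema app to app_reader;
    grant select on all tables in schema app to app_reader;
    -- Cover tables created later as well:
    alter default privileges in schema app grant select on tables to app_reader;
    SQL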

Cary is a Senior Software Developer at HighGo Software Canada with 8 years of industry experience developing innovative software solutions in C/C++ in the field of smart grid & metering prior to joining HighGo. He holds a bachelor's degree in Electrical Engineering from the University of British Columbia (UBC) in Vancouver (2012) and has extensive hands-on experience in technologies such as: advanced networking, network & data security, smart metering innovations, deployment management with Docker, the software engineering lifecycle, scalability, authentication, cryptography, PostgreSQL & non-relational databases, web services, firewalls, embedded systems, RTOS, ARM, PKI, Cisco equipment, and functional and architecture design.

The post Understanding the Security Around PostgreSQL appeared first on Highgo Software Inc.


Thursday, August 5, 2021