NoSQL data store using the seastar framework, compatible with Apache Cassandra

Overview

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++20 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain: a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything on your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, having Docker or Podman available).
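
Before using it, you can quickly confirm that one of the container runtimes dbuild relies on is installed (a trivial check, not part of the official instructions):

$ docker --version || podman --version   # either runtime is sufficient for dbuild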

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
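
If you just want a quick, less optimized binary for local experimentation, configure.py also accepts a --mode option; a dev-mode build (a sketch, assuming your checkout supports the dev mode) trades runtime performance for much faster compile and link times:

$ ./tools/toolchain/dbuild ./configure.py --mode=dev   # dev mode is an assumption; check ./configure.py --help
$ ./tools/toolchain/dbuild ninja build/dev/scylla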

For further information, please see:

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.
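
Once the node is up, a quick sanity check is to connect to it with cqlsh (assuming cqlsh is installed on the host; by default Scylla serves CQL on 127.0.0.1:9042):

$ cqlsh 127.0.0.1 9042 -e 'DESCRIBE KEYSPACES'   # 9042 is the default CQL (native transport) port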

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See the test.py manual.
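
For a quick start, the test suite can be run through the same dbuild wrapper once the corresponding mode has been built (a minimal sketch; test selection and other flags are described in the test.py manual):

$ ./tools/toolchain/dbuild ./test.py --mode release   # flags here are illustrative; see the manual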

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
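
As a rough illustration (the option names and values below are assumptions - consult the Alternator documentation for the authoritative list), enabling the DynamoDB API amounts to choosing a port for it and a write isolation policy when starting Scylla:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1 \
      --alternator-port 8000 --alternator-write-isolation always   # option names/values are illustrative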

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

  • The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the open-source ScyllaDB.
  • The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
Comments
  • c-s latency caused by high latency from peer node


    1. Start 2 nodes n1, n2 using recent scylla master 1fd701e
    2. Enable slow query tracing on both nodes:
       curl -X POST "http://127.0.0.1:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
       curl -X POST "http://127.0.0.2:10000/storage_service/slow_query?enable=true&fast=false&threshold=80000"
    3. Start c-s:
       cassandra-stress write no-warmup cl=TWO n=5000000 -schema 'replication(factor=2)' -port jmx=6868 -mode cql3 native -rate threads=200 -col 'size=FIXED(5) n=FIXED(8)' -pop seq=1500000000..2500000000
    4. Run repair to make c-s latency high to trigger the slow query tracing

    See the following trace: node 127.0.0.2 applies the write very fast (less than 100us), while the remote node 127.0.0.1 took 295677 us. This means the ~300ms latency seen by the client (c-s) was mostly contributed by the remote node. Due to the tracing issues I reported in https://github.com/scylladb/scylla/issues/9403, we do not know where the time was spent on the remote node - it might be disk, network or CPU contention. But I have a feeling the contention is on the network while repair runs, since we do not have a network scheduler. So the theory is that the remote node applies the write very quickly, but either the RPC message carrying the request or the one carrying the response is contended on the network, so in the end node 127.0.0.2 got the response with a high latency.

    cqlsh> SELECT * from system_traces.events WHERE session_id=ea0a5cc0-2021-11ec-be32-b254958ec4a2;
    
     session_id                           | event_id                             | activity                                                                                           | scylla_parent_id | scylla_span_id  | source    | source_elapsed | thread
    --------------------------------------+--------------------------------------+----------------------------------------------------------------------------------------------------+------------------+-----------------+-----------+----------------+---------
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770a-2021-11ec-be32-b254958ec4a2 |                                                                                    Checking bounds |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a770f-2021-11ec-be32-b254958ec4a2 |                                                                             Processing a statement |                0 | 373048741859841 | 127.0.0.2 |              0 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a781b-2021-11ec-be32-b254958ec4a2 | Creating write handler for token: -6493410074079723942 natural: {127.0.0.1, 127.0.0.2} pending: {} |                0 | 373048741859841 | 127.0.0.2 |             27 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a782e-2021-11ec-be32-b254958ec4a2 |                                  Creating write handler with live: {127.0.0.1, 127.0.0.2} dead: {} |                0 | 373048741859841 | 127.0.0.2 |             29 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7850-2021-11ec-be32-b254958ec4a2 |                                                                 X Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             32 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a786a-2021-11ec-be32-b254958ec4a2 |                                                                     X Executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             35 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a7993-2021-11ec-be32-b254958ec4a2 |                                                            Z Finished executing a mutation locally |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a799c-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.2 |                0 | 373048741859841 | 127.0.0.2 |             65 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea0a79e0-2021-11ec-be32-b254958ec4a2 |                                                        Z Finished Sending a mutation to /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |             72 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794ed-2021-11ec-be32-b254958ec4a2 |                                                                     Got a response from /127.0.0.1 |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3794f3-2021-11ec-be32-b254958ec4a2 |                                       Delay decision due to throttling: do not delay, resuming now |                0 | 373048741859841 | 127.0.0.2 |         295677 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea3797f8-2021-11ec-be32-b254958ec4a2 |                                                                    Mutation successfully completed |                0 | 373048741859841 | 127.0.0.2 |         295755 | shard 0
     ea0a5cc0-2021-11ec-be32-b254958ec4a2 | ea379808-2021-11ec-be32-b254958ec4a2 |                                                               Done processing - preparing a result |                0 | 373048741859841 | 127.0.0.2 |         295756 | shard 0
    
    (13 rows)
    
    latency Backport candidate Eng-3 
    opened by asias 203
  • Node stuck 12 hours in decommission


    Installation details
    Scylla version (or git commit hash): 3.1.0.rc5-0.20190902.623ea5e3d
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-02055ad6b0af5669b

    We see that the Thrift and CQL ports are closed, but the nodetool command is stuck:

    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280840-big-Data.db:level=2,
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-279118-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-280714-big-Data.db:level=1, /var/lib/scy
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:31 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] LeveledManifest - Adding high-level (L3) /var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-277690-big-Data.db to ca
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: unbootstrap done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - Thrift server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - CQL server stopped
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: shutdown rpc and cql server done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop batchlog_manager done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - My status = LEFT
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] gossip - No local state or state is in silent shutdown, not announcing shutdown
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] storage_service - DECOMMISSIONING: stop_gossiping done
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 12] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 8] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 9] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 4] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 13] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 2] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 3] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 5] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.87.51:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.113.188:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 0] rpc - client 10.0.63.72:7001: client connection dropped: read: Connection reset by peer
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286124-big-Data.db:level=2,
    Sep 04 22:43:32 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285942-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacted 1 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-286138-big-Data.db:level=2,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 6] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-285956-big-Data.db:level=1, ]
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacted 9 sstables to [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-282149-big-Data.db:level=3,
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 7] compaction - Compacting [/var/lib/scylla/data/keyspace1/standard1-2b9793b0ce8f11e98b9a000000000009/mc-278747-big-Data.db:level=3, /var/lib/scy
    Sep 04 22:43:34 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacting [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4252-big-Data.db:level=0, /var/lib/scylla
    Sep 04 22:43:35 ip-10-0-142-68.eu-west-1.compute.internal scylla[46048]:  [shard 10] compaction - Compacted 2 sstables to [/var/lib/scylla/data/system/large_rows-40550f66085839a09430f27fc08034e9/mc-4266-big-Data.db:level=0, ].
    

    nodetool has been stuck for more than 12 hours:

    [root@ip-10-0-142-68 centos]# ps -fp 119286
    UID         PID   PPID  C STIME TTY          TIME CMD
    centos   119286   1759  0 Sep04 ?        00:00:00 /bin/sh /usr/bin/nodetool -u cassandra -pw cassandra decommission
    [root@ip-10-0-142-68 centos]# date
    Thu Sep  5 12:57:52 UTC 2019
    [root@ip-10-0-142-68 centos]#
    

    Probably related to the nodetool drain stuck issue #4891 and the old issue #961.

    bug Regression 
    opened by bentsi 197
  • resharding + alternator LWT -> Scylla service takes 36 minutes to start


    Installation details

    Kernel Version: 5.13.0-1021-aws

    Scylla version (or git commit hash): 2022.1~rc3-20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a

    Cluster size: 4 nodes (i3.4xlarge)

    Scylla Nodes used in this run:

    • alternator-48h-2022-1-db-node-81cb61d9-5 (34.241.246.188 | 10.0.2.75) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-4 (52.30.41.107 | 10.0.3.6) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-3 (52.214.185.121 | 10.0.1.89) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-2 (34.242.68.250 | 10.0.1.112) (shards: 14)
    • alternator-48h-2022-1-db-node-81cb61d9-1 (176.34.90.117 | 10.0.0.237) (shards: 14)

    OS / Image: ami-071c70d20f0fdbb2c (aws: eu-west-1)

    Test: longevity-alternator-200gb-48h-test

    Test id: 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Test name: longevity/longevity-alternator-200gb-48h-test

    Test config file(s):

    Issue description

    At 2022-04-16 09:34:34.496 a restart-with-resharding nemesis started on node 4. The nemesis shuts down the scylla service, edits the murmur3_partitioner_ignore_msb_bits config value to force resharding, and starts the scylla service again, expecting initialization to take 5 minutes at most. This time, however, it took Scylla 36 minutes to start:

    2022-04-16T09:36:11+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - installing SIGHUP handler
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c with build-id 48dfae0735cd8efc4ae2f5c777beaee2a1e89f4a starting ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting prometheus API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tokens manager
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting effective_replication_map factory
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting lifecycle notifier
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating tracing
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating snitch
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting API server
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla API server listening on 127.0.0.1:10000 ...
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting gossiper
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - seeds={10.0.0.237}, listen_address=10.0.3.6, broadcast_address=10.0.3.6
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage service
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting per-shard database core
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - creating and verifying directories
    2022-04-16T09:36:12+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting database
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage proxy
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting migration manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting query processor
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing batchlog manager
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading system sstables
    2022-04-16T09:40:31+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - loading non-system sstables
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update generator
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - setting up system keyspace
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting commit log
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing migration manager RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - initializing storage proxy RPC verbs
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting streaming service
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting hinted handoff manager
    2022-04-16T09:50:19+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting messaging service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC Generation Management service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting CDC log service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting storage service
    2022-04-16T09:51:20+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting sstables loader
    2022-04-16T10:07:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting system distributed keyspace
    2022-04-16T10:11:47+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting tracing
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - SSTable data integrity checker is disabled.
    2022-04-16T10:11:48+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting auth service
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting batchlog manager
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting load meter
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting cf cache hit rate calculator
    2022-04-16T10:11:50+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting view update backlog broker
    2022-04-16T10:11:53+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Waiting for gossip to settle before accepting client requests...
    2022-04-16T10:12:06+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - allow replaying hints
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Launching generate_mv_updates for non system tables
    2022-04-16T10:12:07+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the view builder
    2022-04-16T10:12:25+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting native transport
    2022-04-16T10:12:26+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - starting the expiration service
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - serving
    2022-04-16T10:12:27+00:00 alternator-48h-2022-1-db-node-81cb61d9-4 !    INFO |  [shard 0] init - Scylla version 2022.1.rc3-0.20220406.5cc3b678c initialization completed.
    

    In particular, the loading phases took far longer than usual.
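
    One quick way to see which init phases dominated is to diff the timestamps of consecutive init lines (a sketch, assuming GNU date and the node's log saved locally as scylla.log):

    $ grep ' init - ' scylla.log | while read -r ts rest; do
          now=$(date -d "$ts" +%s)                              # parse the ISO-8601 timestamp
          [ -n "$prev" ] && echo "$((now - prev))s  $rest"      # seconds since the previous init line
          prev=$now
      done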

    • Restore Monitor Stack command: $ hydra investigate show-monitor 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8
    • Restore monitor on AWS instance using Jenkins job
    • Show all stored logs command: $ hydra investigate show-logs 81cb61d9-8d3f-45ae-8b50-f7882b4a6af8

    Logs:

    db-cluster: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/db-cluster-81cb61d9.tar.gz
    loader-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/loader-set-81cb61d9.tar.gz
    monitor-set: https://cloudius-jenkins-test.s3.amazonaws.com/81cb61d9-8d3f-45ae-8b50-f7882b4a6af8/20220424_100353/monitor-set-81cb61d9.tar.gz

    Jenkins job URL

    Regression compaction resharding 
    opened by ShlomiBalalis 143
  • Coredumps during restart_then_repair_node nemesis


    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.1.0.rc4-0.20190826.e4a39ed31
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0ececa5cacea302a8

    During restart_then_repair_node, the target node (# 5) suffered from streaming exceptions:

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=26510 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-5 [52.50.193.198 | 10.0.133.1] (seed: False)
    2019-08-27T22:03:51+00:00  ip-10-0-133-1 !WARNING | scylla: [shard 0] range_streamer - Bootstrap with 10.0.10.203 for keyspace=scylla_bench failed, took 773.173 seconds: streaming::stream_exception (Stream failed)
    

    Two other nodes, meanwhile, suffered from semaphore timeouts (could be related to #4615):

    (DatabaseLogEvent Severity.CRITICAL): type=DATABASE_ERROR regex=Exception  line_number=14442 node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    2019-08-27T22:06:09+00:00  ip-10-0-178-144 !ERR     | scylla: [shard 7] storage_proxy - Exception when communicating with 10.0.178.144: seastar::semaphore_timed_out (Semaphore timedout)
    

    and created coredumps like so:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000.gz.aa
    backtrace=           PID: 4406 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 21:51:27 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.4406.1566942687000000
           Message: Process 4406 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 4430:
                    #0  0x00007f95cfcc953f raise (libc.so.6)
                    #1  0x00007f95cfcb395e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Here are links to all coredumps of this kind; there is currently a bit of an issue with uploading them, so hopefully at least one of them uploaded correctly:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.16744.1566943317000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17150.1566944278000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.17686.1566944862000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18167.1566945503000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.18731.1566946375000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.19423.1566947108000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000.gz.aa
    backtrace=           PID: 20078 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:17:10 UTC (1min 57s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20078.1566947830000000
           Message: Process 20078 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20089:
                    #0  0x00007fa119b2853f raise (libc.so.6)
                    #1  0x00007fa119b1295e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
    

    Different backtraces and their translations from the run:

    Aug 27 22:03:29 ip-10-0-10-203.eu-west-1.compute.internal scylla[5160]:  [shard 10] seastar - Failed to allocate 851968 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f7420b4602f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x0000000004022a0e
    0x000000000131bcb3
    0x000000000137d78f
    0x000000000131725f
    0x000000000136c8b1
    0x00000000014555b5
    0x0000000001296442
    0x000000000145c35a
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x00000000041776ab
    0x00000000040390dd
    /opt/scylladb/libreloc/libpthread.so.0+0x000000000000858d
    /opt/scylladb/libreloc/libc.so.6+0x00000000000fd6a2
    
    

    translated:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    operator new(unsigned long) at ??:?
     (inlined by) operator new(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1674
    seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:301
    sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at crtstuff.c:?
     (inlined by) seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::maybe_expand(unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:331
     (inlined by) void seastar::circular_buffer<sstables::promoted_index_block, std::allocator<sstables::promoted_index_block> >::emplace_back<position_in_partition, position_in_partition, unsigned long&, unsigned long&, std::optional<sstables::deletion_time> >(position_in_partition&&, position_in_partition&&, unsigned long&, unsigned long&, std::optional<sstables::deletion_time>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/circular_buffer.hh:391
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&, sstables::promoted_index_blocks_reader::m_parser_context&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:416
    data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at crtstuff.c:?
     (inlined by) sstables::promoted_index_blocks_reader::process_state(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:456
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::process(seastar::temporary_buffer<char>&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:404
    seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at crtstuff.c:?
     (inlined by) seastar::future<seastar::consumption_result<char> > std::__invoke_impl<seastar::future<seastar::consumption_result<char> >, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(std::__invoke_other, sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:60
     (inlined by) std::__invoke_result<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >::type std::__invoke<sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char> >(sstables::promoted_index_blocks_reader&, seastar::temporary_buffer<char>&&) at /usr/include/c++/8/bits/invoke.h:96
     (inlined by) std::result_of<sstables::promoted_index_blocks_reader& (seastar::temporary_buffer<char>&&)>::type std::reference_wrapper<sstables::promoted_index_blocks_reader>::operator()<seastar::temporary_buffer<char> >(seastar::temporary_buffer<char>&&) const at /usr/include/c++/8/bits/refwrap.h:319
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:227
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::promoted_index_blocks_reader> >(std::reference_wrapper<sstables::promoted_index_blocks_reader>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::index_reader::advance_upper_past(position_in_partition_view) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::promoted_index_blocks_reader>(sstables::promoted_index_blocks_reader&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::promoted_index_blocks_reader>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::index_entry::get_next_pi_blocks() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/./sstables/index_entry.hh:614
     (inlined by) sstables::index_reader::advance_upper_past(position_in_partition_view) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:582
    seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) [clone .constprop.7996] at sstables.cc:?
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
    _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at crtstuff.c:?
     (inlined by) seastar::future<bool> seastar::future<>::then<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}, seastar::future<bool> >(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/index_reader.hh:775
     (inlined by) seastar::apply_helper<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#1}&&, std::tuple<>) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<bool> seastar::futurize<seastar::future<bool> >::apply<sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}>(sstables::index_reader::advance_lower_and_check_if_present(dht::ring_position_view, std::optional<position_in_partition_view>)::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) _ZZZN7seastar6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS0_IJbEEEEET0_OT_ENKUlvE_clEvENUlSE_E_clINS_12future_stateIJEEEEEDaSE_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:950
     (inlined by) _ZN7seastar12continuationIZZNS_6futureIJEE9then_implIZN8sstables12index_reader34advance_lower_and_check_if_presentEN3dht18ring_position_viewESt8optionalI26position_in_partition_viewEEUlvE_NS1_IJbEEEEET0_OT_ENKUlvE_clEvEUlSF_E_JEE15run_and_disposeEv at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:377
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::smp::configure(boost::program_options::variables_map, seastar::reactor_config)::{lambda()#3}::operator()() const at /usr/include/boost/program_options/variables_map.hpp:146
    std::function<void ()>::operator()() const at /usr/include/c++/8/bits/std_function.h:687
     (inlined by) seastar::posix_thread::start_routine(void*) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/posix.cc:52
    
    Aug 27 22:18:38 ip-10-0-10-203.eu-west-1.compute.internal scylla[31878]:  [shard 0] seastar - Failed to allocate 131072 bytes
    0x00000000041808b2
    0x000000000406d935
    0x000000000406dc35
    0x000000000406dce3
    0x00007f2e2d4c002f
    /opt/scylladb/libreloc/libc.so.6+0x000000000003853e
    /opt/scylladb/libreloc/libc.so.6+0x0000000000022894
    0x00000000040219aa
    0x00000000040235f4
    0x0000000004124fac
    0x0000000000cf0f7d
    0x0000000000cf1027
    0x00000000036ec4bb
    0x0000000004000a85
    0x00000000040030f4
    0x0000000001523df0
    0x0000000001581d82
    0x00000000015a5249
    0x00000000015a7e14
    0x0000000001094cdf
    0x000000000109798d
    0x000000000109872d
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010c786f
    0x00000000010985c1
    0x000000000109b983
    0x000000000109d0d5
    0x00000000010e9aae
    0x00000000010eaa19
    0x0000000000e52db4
    0x000000000406ae21
    0x000000000406b01e
    0x000000000414d06d
    0x0000000003fd51d6
    0x0000000003fd6922
    0x00000000007d9d69
    /opt/scylladb/libreloc/libc.so.6+0x0000000000024412
    0x000000000083a1fd
    
    

    Translation:

    void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::backtrace_buffer::append_backtrace() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) print_with_backtrace at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:1104
    seastar::print_with_backtrace(char const*) at /usr/include/boost/program_options/variables_map.hpp:146
    sigabrt_action at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5012
     (inlined by) _FUN at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:5008
    ?? ??:0
    ?? ??:0
    ?? ??:0
    seastar::memory::on_allocation_failure(unsigned long) at memory.cc:?
    __libc_posix_memalign at ??:?
     (inlined by) posix_memalign at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/memory.cc:1601
    seastar::temporary_buffer<unsigned char>::aligned(unsigned long, unsigned long) at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::file::read_state<unsigned char>::read_state(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:481
     (inlined by) seastar::shared_ptr_no_esft<seastar::file::read_state<unsigned char> >::shared_ptr_no_esft<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:164
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> >::make<unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:266
     (inlined by) seastar::lw_shared_ptr<seastar::file::read_state<unsigned char> > seastar::make_lw_shared<seastar::file::read_state<unsigned char>, unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&>(unsigned long&, unsigned long&, unsigned long&, unsigned int&, unsigned int&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/shared_ptr.hh:416
     (inlined by) seastar::posix_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:2352
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
     (inlined by) auto do_io_check<checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}, , void>(std::function<void (std::__exception_ptr::exception_ptr)> const&, checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/disk-error-handler.hh:73
    checked_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/sstring.hh:257
    tracking_file_impl::dma_read_bulk(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/reader_concurrency_semaphore.cc:184
    seastar::future<seastar::temporary_buffer<char> > seastar::file::dma_read_bulk<char>(unsigned long, unsigned long, seastar::io_priority_class const&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/file.hh:421
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:256
     (inlined by) seastar::future<seastar::temporary_buffer<char> > seastar::futurize<seastar::future<seastar::temporary_buffer<char> > >::apply<seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}>(seastar::file_data_source_impl::issue_read_aheads(unsigned int)::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:1402
     (inlined by) seastar::file_data_source_impl::issue_read_aheads(unsigned int) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:255
    seastar::file_data_source_impl::get() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/fstream.cc:173
    seastar::data_source::get() at /usr/include/c++/8/variant:1356
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}::operator()() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:206
     (inlined by) seastar::future<> seastar::repeat<seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}>(seastar::future<> seastar::input_stream<char>::consume<std::reference_wrapper<sstables::data_consume_rows_context_m> >(std::reference_wrapper<sstables::data_consume_rows_context_m>&&)::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:285
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at crtstuff.c:?
     (inlined by) seastar::future<> seastar::input_stream<char>::consume<sstables::data_consume_rows_context_m>(sstables::data_consume_rows_context_m&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/iostream-impl.hh:236
     (inlined by) data_consumer::continuous_data_consumer<sstables::data_consume_rows_context_m>::consume_input() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/consumer.hh:377
     (inlined by) sstables::data_consume_context<sstables::data_consume_rows_context_m>::read() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/data_consume_context.hh:98
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:479
     (inlined by) seastar::apply_helper<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, std::tuple<>&&, std::integer_sequence<unsigned long> >::apply({lambda()#2}&&, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:35
     (inlined by) auto seastar::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/apply.hh:43
     (inlined by) seastar::future<> seastar::futurize<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&, std::tuple<>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1392
     (inlined by) seastar::future<> seastar::future<>::then_impl<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:936
     (inlined by) seastar::future<> seastar::future<>::then<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}, seastar::future<> >(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:917
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:480
    seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at crtstuff.c:?
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#1}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const::{lambda()#2}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}::operator()() const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/sstables/partition.cc:481
     (inlined by) seastar::future<> seastar::do_void_futurize_helper<seastar::future<> >::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1359
     (inlined by) seastar::future<> seastar::futurize<void>::apply<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) seastar::future<> seastar::do_until<sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}>(sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
    sstables::sstable_mutation_reader<sstables::data_consume_rows_context_m, sstables::mp_row_consumer_m>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at crtstuff.c:?
    flat_mutation_reader::impl::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:308
     (inlined by) apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1402
     (inlined by) futurize_apply<mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)>, mutation_reader_merger::reader_and_last_fragment_kind&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1474
     (inlined by) parallel_for_each<mutation_reader_merger::reader_and_last_fragment_kind*, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:129
    parallel_for_each<utils::small_vector<mutation_reader_merger::reader_and_last_fragment_kind, 4>&, mutation_reader_merger::prepare_next(seastar::lowres_clock::time_point)::<lambda(mutation_reader_merger::reader_and_last_fragment_kind)> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:307
    mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:391
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(flat_mutation_reader&)#1}::operator()(flat_mutation_reader&) const at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:637
     (inlined by) _ZN27restricting_mutation_reader11with_readerIZNS_11fill_bufferENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEEEUlR20flat_mutation_readerE_EEDcT_S9_ at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:610
     (inlined by) restricting_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:641
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:384
    mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:120
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:489
    repeat<combined_mutation_reader::fill_buffer(seastar::lowres_clock::time_point)::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:768
     (inlined by) combined_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/mutation_reader.cc:500
    flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:681
    apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /usr/include/c++/8/bits/unique_ptr.h:81
     (inlined by) apply<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>&> at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future.hh:1385
     (inlined by) do_until<flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()>, flat_multi_range_mutation_reader<Generator>::fill_buffer(seastar::lowres_clock::time_point) [with Generator = make_flat_multi_range_reader(schema_ptr, mutation_source, const partition_range_vector&, const query::partition_slice&, const seastar::io_priority_class&, tracing::trace_state_ptr, mutation_reader::forwarding)::adapter]::<lambda()> > at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:507
     (inlined by) fill_buffer at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.cc:682
    _ZN7seastar8internal8repeaterIZZ19fragment_and_freeze20flat_mutation_readerSt8functionIFNS_6futureIJNS_10bool_classINS_18stop_iteration_tagEEEEEE15frozen_mutationbEEmENKUlRT_RT0_E_clIS2_28fragmenting_mutation_freezerEEDaSD_SF_EUlvE_E15run_and_disposeEv at frozen_mutation.cc:?
     (inlined by) flat_mutation_reader::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/flat_mutation_reader.hh:337
     (inlined by) operator() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/frozen_mutation.cc:259
     (inlined by) run_and_dispose at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/include/seastar/core/future-util.hh:218
    seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
    seastar::reactor::run_some_tasks() at /usr/include/boost/program_options/variables_map.hpp:146
     (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../src/core/reactor.cc:4243
    seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /jenkins/workspace/scylla-3.1/relocatable-pkg/scylla/seastar/build/release/../../include/seastar/core/future.hh:768
    main at crtstuff.c:?
    ?? ??:0
    _start at ??:?
    
    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-4 [34.245.137.134 | 10.0.178.144] (seed: False)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000.gz.aa
    backtrace=           PID: 20758 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 23:29:37 UTC (1min 52s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 9f0393fe20f04dfab829e5bb5cc4bdad
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-178-144.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.9f0393fe20f04dfab829e5bb5cc4bdad.20758.1566948577000000
           Message: Process 20758 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 20769:
                    #0  0x00007f179a95053f raise (libc.so.6)
                    #1  0x00007f179a93a95e abort (libc.so.6)
                    #2  0x0000000000469b8e _ZN8logalloc18allocating_section7reserveEv (scylla)
                    #3  0x00000000071c0d93 n/a (n/a)
    
    

    The other node's coredumps:

    (CoreDumpEvent Severity.CRITICAL): node=Node longevity-large-partitions-4d-3-1-db-node-49dc20d4-1 [63.35.248.143 | 10.0.10.203] (seed: True)
    corefile_urls=
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000.gz.aa
    backtrace=           PID: 5160 (scylla)
               UID: 996 (scylla)
               GID: 1001 (scylla)
            Signal: 6 (ABRT)
         Timestamp: Tue 2019-08-27 22:03:29 UTC (1min 55s ago)
      Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-11
        Executable: /opt/scylladb/libexec/scylla
     Control Group: /
           Boot ID: 3f7c927968ca4130a5cfc4b02933017f
        Machine ID: df877a200226bc47d06f26dae0736ec9
          Hostname: ip-10-0-10-203.eu-west-1.compute.internal
          Coredump: /var/lib/systemd/coredump/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.5160.1566943409000000
           Message: Process 5160 (scylla) of user 996 dumped core.
                    
                    Stack trace of thread 5170:
                    #0  0x00007f742044b53f raise (libc.so.6)
                    #1  0x00007f742043595e abort (libc.so.6)
                    #2  0x00000000040219ab on_allocation_failure (scylla)
    

    Other download locations:

    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.31878.1566944318000000.gz.aa
    https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz/core.scylla.996.3f7c927968ca4130a5cfc4b02933017f.32438.1566945161000000.gz.aa
    
    

    Relevant journalctl logs of the nodes can be found at scratch.scylladb.com/shlomib/longevity-large-partitions-4d-db-cluster.tar

    bug 
    opened by ShlomiBalalis 140
  • Significant fall down of operations per seconds during replace node

    Significant fall down of operations per seconds during replace node

    Installation details
    Scylla version (or git commit hash): 4.2.rc4-0.20200914.338196eab with build-id 7670ef1d82ff6b35783e1035d6544c7cc9abd90f
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0bb0f15782d03eec3 (eu-north-1)
    Instance type: i3.4xlarge

    During job https://jenkins.scylladb.com/view/scylla-4.2/job/scylla-4.2/job/longevity/job/longevity-mv-si-4days-test/5 the TerminateAndReplace nemesis was executed several times. This nemesis terminates the instance of one node (node4) and then adds a new node (node6). While the new node was being added, operations per second dropped from 25k ops to 81 ops on each nemesis execution: Screenshot from 2020-09-17 18-32-33

    The second time, node5 was terminated and node8 was added as its replacement:

    Screenshot from 2020-09-17 18-41-53

    monitoring node available: http://13.49.78.221:3000/d/N0wDzKdGk/scylla-per-server-metrics-nemesis-master?orgId=1&from=1600146424237&to=1600300730028&var-by=instance&var-cluster=&var-dc=All&var-node=All&var-shard=All&var-sct_tags=DatabaseLogEvent&var-sct_tags=DisruptionEvent
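
    For context, a node replacement of this kind is normally done by bootstrapping the fresh node with the terminated node's address configured in scylla.yaml before its first start. A minimal sketch of the usual procedure (the yaml key, paths, and the IP placeholder are illustrative and should be checked against the documentation for the exact version; they are not taken from this report):

    # On the new replacement node, before starting Scylla for the first time:
    echo 'replace_address_first_boot: <ip-of-terminated-node>' | sudo tee -a /etc/scylla/scylla.yaml
    # Start Scylla; it streams the dead node's data from the surviving replicas,
    # which is exactly the window in which the cluster also has to serve the client load.
    sudo systemctl start scylla-server
    # The new node should move from UJ (joining) to UN (normal) as the replace completes.
    nodetool status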

    Next c-s commands:

    2020-09-15 14:49:50.535: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:20.510: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_4mv_5queries.yaml ops'(insert=15,read1=1,read2=1,read3=1,read4=1,read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:50:30.964: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:00.837: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2mv_2queries.yaml ops'(insert=6,mv_p_read1=1,mv_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:11.261: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:41.228: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_3si_5queries.yaml ops'(insert=25,si_read1=1,si_read2=1,si_read3=1,si_read4=1,si_read5=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:51:51.676: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-1 [13.48.13.140 | 10.0.1.125] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    2020-09-15 14:52:21.540: (CassandraStressEvent Severity.NORMAL): type=start node=Node longevity-mv-si-4d-4-2-loader-node-ca850009-2 [13.53.109.33 | 10.0.2.224] (seed: False)
    stress_cmd=cassandra-stress user profile=/tmp/c-s_profile_2si_2queries.yaml ops'(insert=10,si_p_read1=1,si_p_read2=1)' cl=QUORUM duration=5760m -port jmx=6868 -mode cql3 native -rate threads=10 -node 10.0.3.235 -errors skip-unsupported-columns
    

    The following schema was generated:

    CREATE KEYSPACE mview WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE mview.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_first_name AS
        SELECT first_name, userid, email
        FROM mview.users
        WHERE first_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (first_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_initials AS
        SELECT initials, userid
        FROM mview.users
        WHERE initials IS NOT null AND userid IS NOT null
        PRIMARY KEY (initials, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_email AS
        SELECT email, userid
        FROM mview.users
        WHERE email IS NOT null AND userid IS NOT null
        PRIMARY KEY (email, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_password AS
        SELECT password, userid
        FROM mview.users
        WHERE password IS NOT null AND userid IS NOT null
        PRIMARY KEY (password, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_last_name AS
        SELECT last_name, userid, email
        FROM mview.users
        WHERE last_name IS NOT null AND userid IS NOT null
        PRIMARY KEY (last_name, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW mview.users_by_address AS
        SELECT address, userid
        FROM mview.users
        WHERE address IS NOT null AND userid IS NOT null
        PRIMARY KEY (address, userid)
        WITH CLUSTERING ORDER BY (userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE KEYSPACE keyspace1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '5'}  AND durable_writes = true;
    
    CREATE TABLE keyspace1.standard1 (
        key blob PRIMARY KEY,
        "C0" blob,
        "C1" blob,
        "C2" blob,
        "C3" blob,
        "C4" blob,
        aqpq3qgcom list<frozen<set<timestamp>>>,
        b69k9r389z list<frozen<map<frozen<map<frozen<map<bigint, timeuuid>>, frozen<map<bigint, bigint>>>>, frozen<set<tinyint>>>>>,
        bdbs5ixqdq map<frozen<map<frozen<set<inet>>, frozen<set<frozen<list<date>>>>>>, frozen<list<frozen<set<decimal>>>>>,
        f4xwkb2zcm set<frozen<map<frozen<set<timestamp>>, frozen<map<timeuuid, inet>>>>>,
        fywh69a04j set<frozen<map<frozen<set<varint>>, frozen<set<frozen<map<frozen<map<smallint, inet>>, varint>>>>>>>,
        hacdvjo18p set<frozen<list<frozen<map<smallint, bigint>>>>>,
        iopuqysiqf list<frozen<map<frozen<set<frozen<list<text>>>>, frozen<map<frozen<set<ascii>>, ascii>>>>>,
        jxu8tsm8v5 set<frozen<map<frozen<list<blob>>, frozen<map<frozen<list<text>>, frozen<map<int, smallint>>>>>>>,
        ki1u5t67nf set<frozen<set<ascii>>>,
        l8pw46826p list<frozen<map<frozen<list<date>>, frozen<list<frozen<map<ascii, double>>>>>>>,
        oj5epbs4pn list<frozen<set<frozen<map<smallint, int>>>>>,
        ortj1um8mc set<frozen<list<frozen<map<float, double>>>>>,
        p8v0kjmfsr list<frozen<list<varint>>>,
        rulnhv7azy set<frozen<set<frozen<set<float>>>>>,
        si5zsclur2 map<frozen<map<frozen<list<bigint>>, boolean>>, frozen<set<frozen<set<float>>>>>,
        v3p7qqv1vn list<frozen<list<bigint>>>,
        wyhqruomlw map<frozen<map<frozen<set<int>>, frozen<set<smallint>>>>, frozen<set<frozen<list<frozen<map<ascii, int>>>>>>>,
        yskieerio3 set<frozen<list<frozen<map<frozen<list<timeuuid>>, frozen<set<decimal>>>>>>>
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    
    CREATE KEYSPACE sec_index WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
    
    CREATE TABLE sec_index.users (
        userid bigint PRIMARY KEY,
        address text,
        email text,
        first_name text,
        initials int,
        last_access timeuuid,
        last_name text,
        password text,
        userdata blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 4678
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    CREATE INDEX users_initials_ind ON sec_index.users (initials);
    CREATE INDEX users_last_name_ind ON sec_index.users (last_name);
    CREATE INDEX users_last_access_ind ON sec_index.users (last_access);
    CREATE INDEX users_first_name_ind ON sec_index.users (first_name);
    CREATE INDEX users_address_ind ON sec_index.users (address);
    
    CREATE MATERIALIZED VIEW sec_index.users_address_ind_index AS
        SELECT address, idx_token, userid
        FROM sec_index.users
        WHERE address IS NOT NULL
        PRIMARY KEY (address, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_first_name_ind_index AS
        SELECT first_name, idx_token, userid
        FROM sec_index.users
        WHERE first_name IS NOT NULL
        PRIMARY KEY (first_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_initials_ind_index AS
        SELECT initials, idx_token, userid
        FROM sec_index.users
        WHERE initials IS NOT NULL
        PRIMARY KEY (initials, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_access_ind_index AS
        SELECT last_access, idx_token, userid
        FROM sec_index.users
        WHERE last_access IS NOT NULL
        PRIMARY KEY (last_access, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    
    CREATE MATERIALIZED VIEW sec_index.users_last_name_ind_index AS
        SELECT last_name, idx_token, userid
        FROM sec_index.users
        WHERE last_name IS NOT NULL
        PRIMARY KEY (last_name, idx_token, userid)
        WITH CLUSTERING ORDER BY (idx_token ASC, userid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
        AND comment = ''
        AND compaction = {'class': 'SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';
    

    No reactor stalls were detected during this.

    All db logs: https://cloudius-jenkins-test.s3.amazonaws.com/ca850009-fb1d-4d43-ac60-0fdbce75cc71/20200916_203618/db-cluster-ca850009.zip

    bug repair-based-operations 
    opened by aleksbykov 132
  • Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Performance regression of 780% in p99th latency compared to 2.2.0 for 100% read test

    Installation details
    Scylla version (or git commit hash): 2.3.rc0-0.20180722.a77bb1fe3
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): AWS AMI (ami-905252ef)
    Instance type: i3.4xlarge

    test_latency_read results showing 780% regression in p99th latency compared to 2.2.0:

    Version | Op rate total | Latency mean | Latency 99th percentile
    -- | -- | -- | --
    2.2.0 | 39997.0 [2018-07-19 10:26:37] | 1.4 [2018-07-19 10:26:37] | 3.1 [2018-07-19 10:26:37]
    2.3.0 | 37200.0 (6% Regression) | 8.2 (485% Regression) | 27.3 (780% Regression)

    (The 780% figure is the relative increase in p99 latency: (27.3 − 3.1) / 3.1 ≈ 7.8.)

    2.3.0 p99th latency looks abnormal and reaches peaks of ~400ms: screen shot 2018-07-25 at 1 26 42

    The test populates 1TB of data and then starts a c-s read command: cassandra-stress read no-warmup cl=QUORUM duration=50m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=100 limit=10000/s' -errors ignore -col 'size=FIXED(1024) n=FIXED(1)' -pop 'dist=gauss(1..1000000000,500000000,50000000)' (During the first part of the test we can still see compactions that are leftovers of the write population.)

    Full screenshot: screencapture-34-230-6-17-3000-dashboard-db-scylla-per-server-metrics-nemesis-master-test-latency-2-3-2018-07-25-01_31_03

    bug performance Regression 
    opened by roydahan 127
  • Some shards get stuck in tight loop during repair

    Some shards get stuck in tight loop during repair

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [x] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 5.0.1
    Cluster size: 5
    OS (RHEL/CentOS/Ubuntu/AWS AMI): Ubuntu 20.04

    Hardware details (for performance issues)
    Platform (physical/VM/cloud instance type/docker): Hetzner
    Hardware: sockets=1 cores=4 hyperthreading=8 memory=64G
    Disks: 2x SSD in RAID1

    A few shards on one of my nodes got stuck in a tight loop while running a repair operation. It has been going on for a day and is not making any progress. All the while, CPU usage on three cores is stuck at 100%: image

    Restarting the node also hangs, until it eventually gets killed by systemd. When the node restarts, the same shards get stuck again shortly after initialization.
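
    A minimal way to see what a spinning shard is actually doing, using standard Linux tooling and assuming shell access to the affected node (the CPU number is illustrative; Scylla pins one shard per core):

    # Sample the core the stuck shard is pinned to
    sudo perf top -C 2
    # Or grab a one-off backtrace of all reactor threads (attaches briefly, then detaches)
    sudo gdb -batch -ex 'thread apply all bt' -p "$(pgrep -o -f /usr/bin/scylla)"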

    I have exported logs for the node that is getting stuck: https://pixeldrain.com/u/rwtFViqp And the repair master: https://pixeldrain.com/u/PjML3Y3F

    Some of my data has become inconsistent after some downtime and I have no way to repair it now. Please help.

    bug User Request 
    opened by Fornax96 117
  • test_latency_mixed_with_nemesis - latency during "steady state" get to 20 ms without heavy stalls

    test_latency_mixed_with_nemesis - latency during "steady state" get to 20 ms without heavy stalls

    Installation details
    Scylla version (or git commit hash): 4.4.dev-0.20210114.32fd38f34 with build-id 0642bb3b142094f1092b0d276f6efa858081fe96
    Cluster size: 3
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-012cafbb2dc4f1e4d (eu-west-1)

    Running a mixed workload with the command: cassandra-stress mixed no-warmup cl=QUORUM duration=350m -schema 'replication(factor=3)' -port jmx=6868 -mode cql3 native -rate 'threads=50 throttle=3500/s' -col 'size=FIXED(128) n=FIXED(8)' -pop 'dist=gauss(1..250000000,125000000,12500000)'

    During the steady state, the only stalls detected were:

    2021-01-15T06:40:39+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:48:16+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-3 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    2021-01-15T06:51:27+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 4.
    2021-01-15T06:58:25+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 6.
    2021-01-15T06:59:50+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 7.
    2021-01-15T07:07:13+00:00  perf-latency-nemesis-perf-v10-db-node-9420ec57-2 !INFO    | scylla: Reactor stalled for 6 ms on shard 5.
    

    The values for the steady-state latency are:

    Metric name | Metric value
    ----------------- | ------------------
    "c-s P95" | "5.40"
    "c-s P99" | "19.10"
    "Scylla P99_read - node-3" | "19.20"
    "Scylla P99_write - node-1" | "13.76"
    "Scylla P99_read - node-2" | "23.66"
    "Scylla P99_write - node-2" | "13.79"
    "Scylla P99_read - node-1" | "23.55"
    "Scylla P99_write - node-3" | "1.56"

    there is a live monitor here

    here is a live snapshot (if monitor dies)

    from the monitor, we can see: c-s latency Screenshot from 2021-01-28 15-30-08 (copy)

    c-s_max

    and Scylla latency: read_99th

    write_99th

    Comparing with the original document where we checked these values, we have, for Scylla 4.1:

    Metric name | read value | write value
    ----------------- | ---------------- | ---------------
    Mean | 0.9 ms | 0.4 ms
    P95 | 7.8 ms | 1.4 ms
    P99 | 48.2 ms | 2.5 ms
    Max | 71 ms | 71 ms

    and for Scylla 666.development-0.20200910.02ee0483b:

    Metric name | read value | write value
    ----------------- | ---------------- | ---------------
    Mean | 0.7 ms | 0.3 ms
    P95 | 3.6 ms | 0.9 ms
    P99 | 6 ms | 1.2 ms
    Max | 16.8 ms | 16.8 ms

    All the nodes' logs can be downloaded here

    Even the c-s 95th percentile is too high for a steady-state period: c-s_95th

    bug latency 
    opened by fgelcer 116
  • Permanent read/write fails after "bad_alloc"

    Permanent read/write fails after "bad_alloc"

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version (or git commit hash): 3.2.4
    Cluster size: 5+5 (multi DC)
    OS (RHEL/CentOS/Ubuntu/AWS AMI): C7.5

    Platform (physical/VM/cloud instance type/docker): bare metal
    Hardware: sockets=2 x Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz, cores=40, hyperthreading=yes, memory=6x32GB DDR4 2666MHz
    Disks: RAID 10 of 10 HDDs, 14TB each, for data; RAID 1 SSD 1TB for commit logs

    Hi!

    The problem started with errors like "exception during mutation write to 10.161.180.24: std::bad_alloc (std::bad_alloc)" and led to one shard constantly failing a lot of (probably all) write/read operations until scylla-server was manually restarted. I suspect this may be due to having large partitions, so here is what we have on that (we have 2 CFs):

    becca/events histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             2.00             16.00          47988.50               770                29
    75%             2.00             20.00          79061.50              5722               215
    95%             6.00             33.00         185724.05             88148              2759
    98%             8.00             36.00         239365.28            182785              5722
    99%            10.00             46.73         295955.11            263210              8239
    Min             0.00              1.00             20.00                73                 2
    Max            24.00          29492.00        2051039.00         464228842           5839588
    
    becca/events_by_ip histograms
    Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                                  (micros)          (micros)           (bytes)
    50%             0.00             16.00              0.00              6866               179
    75%             0.00             19.75              0.00             29521               770
    95%             0.00             33.00              0.00            315852              8239
    98%             0.00             41.00              0.00            785939             20501
    99%             0.00             48.43              0.00           1629722             42510
    Min             0.00              1.00              0.00                73                 0
    Max             0.00          19498.00              0.00         386857368           4866323
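
    For scale, the Max partition sizes reported above (464228842 and 386857368 bytes) are roughly 440 MB and 370 MB, which is well into large-partition territory and a plausible source of allocation pressure. A minimal, hedged way to re-check this on a node (the keyspace/table names are taken from the histograms above; the journal grep assumes the usual large-partition warning is in effect):

    # Per-table partition-size and cell-count percentiles, like the output shown above
    nodetool cfhistograms becca events
    nodetool cfhistograms becca events_by_ip
    # Scylla typically logs a warning when it writes a partition above the configured threshold
    journalctl -u scylla-server | grep -i 'large partition'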
    

    In any case, even if some big query arrived and failed, I do not quite understand why all subsequent queries kept failing until the node was restarted.

    Logs: https://cloud.mail.ru/public/C3AZ/RxPZyKUV6

    Dashboard (by shard)

    Screenshot 2020-05-09 at 20 11 38 Screenshot 2020-05-09 at 17 41 49

    bug User Request hinted-handoff bad_alloc 
    opened by gibsn 114
  • Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Cassandra Stress times out: BusyPoolException: no available connection and timed out after 5000 MILLISECONDS / using shard-aware driver, get the 1tb longevity to overload

    Installation details
    Scylla version (or git commit hash): 4.3.rc2-0.20201126.bc922a743 with build-id 840fd4b3f6304765c03e886269b1c2550bf23e53
    Cluster size: 4
    OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-09f30667ba6e09e9b (eu-west-1)
    Scenario: 1tb-7days

    Half an hour into the stress run, at 15:15, a consistent stream of BusyPoolException errors started coming from three of the four nodes (the driver's per-host request queue, capped at 256, had filled up) and continued throughout the entire remaining run of the stress:

    15:15:22.497 [Thread-641] DEBUG c.d.driver.core.RequestHandler - [1227134168-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.650 [cluster1-nio-worker-21] DEBUG c.d.driver.core.RequestHandler - [540726118-0] Error querying 10.0.0.5/10.0.0.5:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.0.5/10.0.0.5:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    
    15:25:50.717 [Thread-177] DEBUG c.d.driver.core.RequestHandler - [544250492-0] Error querying 10.0.3.37/10.0.3.37:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.3.37/10.0.3.37:9042] Pool is busy (no available connection and the queue has reached its max size 256)
    ...
    15:32:59.638 [cluster1-nio-worker-29] DEBUG c.d.driver.core.RequestHandler - [640744570-0] Error querying 10.0.1.149/10.0.1.149:9042 : com.datastax.driver.core.exceptions.BusyPoolException: [10.0.1.149/10.0.1.149:9042] Pool is busy (no available connection and timed out after 5000 MILLISECONDS)
    

    At the same time, the stress experienced consistent WriteTimeoutException errors, since writes failed to achieve quorum (with replication factor 3, QUORUM needs floor(3/2) + 1 = 2 acknowledging replicas):

    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 1 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during SIMPLE write query at consistency QUORUM (2 replica were required but only 0 acknowledged the write)
    
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.1.149/10.0.1.149:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.0.5/10.0.0.5:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    com.datastax.driver.core.exceptions.OperationTimedOutException: [10.0.3.37/10.0.3.37:9042] Timed out waiting for server response
    

    At 16:18, the stress starts to experience EMPTY RESULT errors:

    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27584 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27648 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.3.37:9042-7, inFlight=128, closed=false] Response received on stream 27712 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32640 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 32704 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.0.5:9042-11, inFlight=128, closed=false] Response received on stream 0 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16320 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16384 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    16:18:24.362 [cluster1-nio-worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.149:9042-4, inFlight=128, closed=false] Response received on stream 16448 but no handler set anymore (either the request has timed out or it was closed due to another error). Received message is EMPTY RESULT
    

    Oddly enough, node #4, 10.0.1.77, does not seem to experience any timeouts. In fact, the only messages I see for it in the stress log during that time period are healthy heartbeat messages:

    14:42:15.661 [cluster1-nio-worker-3] DEBUG com.datastax.driver.core.Connection - Connection[/10.0.1.77:9042-2, inFlight=1, closed=false] Keyspace set to keyspace1
    16:19:32.899 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] preparing to open 1 new connections, total = 15
    16:19:32.901 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Connection established, initializing transport
    16:19:32.937 [cluster1-nio-worker-17] DEBUG c.d.s.netty.handler.ssl.SslHandler - [id: 0x14eb560e, L:/10.0.1.115:48940 - R:10.0.1.77/10.0.1.77:9042] HANDSHAKEN: TLS_RSA_WITH_AES_128_CBC_SHA
    16:19:41.082 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Host.STATES - [10.0.1.77/10.0.1.77:9042] Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] Transport initialized, connection ready
    16:20:03.838 [cluster1-reconnection-0] DEBUG com.datastax.driver.core.Host.STATES - [Control connection] established to 10.0.1.77/10.0.1.77:9042
    16:20:33.809 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:20:41.918 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:11.926 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:13.881 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:21:43.882 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:21:48.369 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    16:22:18.373 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] was inactive for 30 seconds, sending heartbeat
    16:22:22.816 [cluster1-nio-worker-17] DEBUG com.datastax.driver.core.Connection - Connection[10.0.1.77/10.0.1.77:9042-28, inFlight=0, closed=false] heartbeat query succeeded
    

    Screenshot from 2020-12-03 13-44-38

    Screenshot from 2020-12-03 14-04-38

    Judging by the per-instance metrics of both foreground and background writes, node #4 indeed receives fewer writes than any other node. Perhaps this caused the in-flight hint messages of the other nodes to fill up, considering that in the errors above the connections to nodes 1-3 all report inFlight=128. Or perhaps there is an issue with key distribution between the nodes, which caused the other nodes to receive more load than they could handle.
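
    One hedged way to test the key-distribution hypothesis is to compare ownership and load per node with nodetool. A minimal sketch, assuming the stress used its default keyspace1/standard1 schema (adjust the names if the run used a different schema):

    # Effective token ownership and load per node for the stress keyspace (run on any node).
    nodetool status keyspace1
    # Per-table load statistics (older nodetool versions call this "cfstats").
    nodetool tablestats keyspace1.standard1
    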

    The failed stress command:

    cassandra-stress write cl=QUORUM n=1100200300 -schema 'replication(factor=3) compaction(strategy=LeveledCompactionStrategy)' -port jmx=6868 -mode cql3 native -rate threads=1000 -col 'size=FIXED(200) n=FIXED(5)' -pop seq=1..1100200300
    

    Other prepare stresses for this run:

    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=LZ4Compressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=lz4 -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=SnappyCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=snappy -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=DeflateCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    cassandra-stress write cl=QUORUM n=50000000 -schema 'replication(factor=3) compression=ZstdCompressor compaction(strategy=SizeTieredCompactionStrategy)' -port jmx=6868 -mode cql3 native compression=none -rate threads=50 -pop seq=1..50000000 -log interval=5
    

    (Each of them runs once, spread across 2 loaders)

    Node list:

    longevity-tls-1tb-7d-4-3-db-node-66a319cd-1 [34.243.3.190 | 10.0.1.149]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-2 [54.246.50.198 | 10.0.0.5] 
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-3 [54.247.54.152 | 10.0.3.37]
    longevity-tls-1tb-7d-4-3-db-node-66a319cd-4 [52.211.7.163 | 10.0.1.77] 
    

    Logs:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                                                            Log links for testrun with test id 66a319cd-223d-450b-8f0f-2bb423d39693                                                                                            |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Date            | Log type    | Link                                                                                                                                                                                                                          |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | 20190101_010101 | prometheus  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/prometheus_snapshot_20201202_164129.tar.gz                                                                                                |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-overview-20201202_163158-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_163157 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_163157/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_163545-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-overview-20201202_164145-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png                          |
    | 20201202_164145 | grafana     | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_164145/grafana-screenshot-scylla-per-server-metrics-nemesis-20201202_164500-longevity-tls-1tb-7d-4-3-monitor-node-66a319cd-1.png |
    | 20201202_165046 | db-cluster  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/db-cluster-66a319cd.zip                                                                                                   |
    | 20201202_165046 | loader-set  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/loader-set-66a319cd.zip                                                                                                   |
    | 20201202_165046 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/monitor-set-66a319cd.zip                                                                                                  |
    | 20201202_165046 | sct-runner  | https://cloudius-jenkins-test.s3.amazonaws.com/66a319cd-223d-450b-8f0f-2bb423d39693/20201202_165046/sct-runner-66a319cd.zip                                                                                                   |
    +-----------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    

    To start the monitor using hydra:

    hydra investigate show-monitor 66a319cd-223d-450b-8f0f-2bb423d39693
    
    longevity overload 
    opened by ShlomiBalalis 109
  • some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    some non-prepared statements can leak memory (with set/map/tuple/udt literals)

    This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at [email protected] or in our slack channel.

    • [*] I have read the disclaimer above, and I am reporting a suspected malfunction in Scylla.

    Installation details
    Scylla version: 4.0.4
    Cluster size: 10 nodes, 4 shards per node
    OS: Ubuntu

    After running fine for a few days, nodes consistently start hitting 'bad_alloc' errors, even though we do not have many data files (~1850 Data files) and our data size (400G per node) is not that large compared to the memory available on the node (90G for 4 shards, so about 22G per shard).

    Aug 11 04:31:13 fr-eqx-scylla-04 scylla[10177]: WARN 2020-08-11 04:31:13,007 [shard 0] storage_proxy - Failed to apply mutation from 192.168.96.47#0: std::bad_alloc (std::bad_alloc)

    Our non-LSA memory keeps growing, and once it reaches a certain level the shard just starts throwing bad_alloc:

    [Graph: non_lsa memory per shard on scylla-04, 2020-08-11]
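
    To follow the non-LSA growth outside Grafana, the node's Prometheus endpoint can be scraped directly. A minimal sketch, assuming the default metrics port 9180 and that the memory/LSA gauge names have not changed in this version (both are assumptions):

    # Dump memory/LSA gauges from one node; metric names differ between Scylla
    # versions, so grep broadly and drop the HELP/TYPE comment lines.
    curl -s http://fr-eqx-scylla-04:9180/metrics | grep -Ei 'lsa|memory' | grep -v '^#'
    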

    bug User Request bad_alloc 
    opened by withings-sas 104
  • configure: don't reduce parsers' optimization level to 1 in release

    configure: don't reduce parsers' optimization level to 1 in release

    The line modified in this patch was supposed to increase the parsers' optimization level to 1 in debug mode, because they were too slow otherwise. But as a side effect, it also reduced the optimization level to 1 in release mode. This is not a problem for the CQL frontend, because statement preparation is not performance-sensitive, but it is a serious performance problem for Alternator, where parsing lies on the hot path.

    Fix this by applying -O1 only in debug modes.
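
    A quick, hedged way to verify the change is to look at every place configure.py mentions -O1 and confirm the override is now scoped to the debug mode only (this assumes the flag appears literally as "-O1" in the file):

    # Show each mention of -O1 in configure.py with surrounding context; after the fix,
    # it should appear only under the debug mode settings.
    grep -n -C 3 -e '-O1' configure.py
    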

    opened by michoecho 3
  • repair: finish repair immediately on local keyspaces

    repair: finish repair immediately on local keyspaces

    Keyspaces with the local replication strategy do not need to be repaired. Thus, for such keyspaces, repair_service::do_repair_start now returns 0 (instead of a repair sequence number) immediately.

    opened by Deexie 3
  • Timeout point sent in `forward_request` verb comes from `seastar::lowres_clock`

    Timeout point sent in `forward_request` verb comes from `seastar::lowres_clock`

    In the parallelized aggregation layer, query::forward_request (which is sent to remote nodes) carries timeout information as a lowres_clock::time_point taken from the local seastar::lowres_clock. That time_point is later used on the remote nodes to compute the timeout deadline. This is wrong, because lowres_clock time points are only meaningful on the node that produced them, and it may lead to delayed or premature timeouts.

    bug P2 
    opened by havaker 1
  • test: test_topology: make test_nodes_with_different_smp less hacky

    test: test_topology: make test_nodes_with_different_smp less hacky

    The test would use a trick to start a separate Scylla cluster from the one provided originally by the test framework. This is not supported by the test framework and may cause unexpected problems.

    Change the test to perform regular node operations. Instead of starting a fresh cluster of 3 nodes, we join the first of these nodes to the original framework-provided cluster, then decommission the original nodes, then bootstrap the other 2 fresh nodes.

    Also add some logging to the test.

    Refs: #12438, #12442

    opened by kbr-scylla 4
  • doc: replace Scylla with ScyllaDB on the menu tree and major links

    doc: replace Scylla with ScyllaDB on the menu tree and major links

    This PR replaces "Scylla" with "ScyllaDB" in the most exposed places, including the left menu panel (this involves updating the page titles) and links on the index pages. This PR partly satisfies the requirements of https://github.com/scylladb/scylla-docs/issues/3962.

    If a page or section title is updated, the underline markup must also be updated (it must be at least as long as the title text). Examples:

    OK:

    ScyllaDB Hinted Handoff
    =======================
    

    OK:

    ScyllaDB Hinted Handoff
    ===========================
    

    WRONG:

    ScyllaDB Hinted Handoff
    ==============
    
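
    A small, hedged sanity check for this rule (the docs/ path and the assumption that titles are underlined with '=' are mine, not part of the PR):

    # Rough check: report RST '=' underlines shorter than the preceding title line.
    # Other underline characters (-, ~, etc.) would need the same treatment.
    find docs -name '*.rst' -exec awk '
        prev != "" && $0 ~ /^=+$/ && length($0) < length(prev) {
            printf "%s:%d: underline shorter than title \"%s\"\n", FILENAME, FNR, prev
        }
        { prev = $0 }
    ' {} +
    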
    Documentation 
    opened by annastuchlik 0
  • tools: toolchain: drop s390x from prepare script architecture list

    tools: toolchain: drop s390x from prepare script architecture list

    It's been a long while since we built ScyllaDB for s390x, and in fact the last time I checked, the build was broken: the ragel parser generator was producing bad source files for the HTTP parser. So just drop it from the list.

    I kept s390x in the architecture mapping table since it's still valid.

    opened by avikivity 1
Owner
ScyllaDB, The Real-Time Big Data Database
Apache Druid: a high performance real-time analytics database.

Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download Apache Druid Druid is a high performance real-time a

The Apache Software Foundation 12.3k Jan 2, 2023
Apache HBase

Apache HBase [1] is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Str

The Apache Software Foundation 4.7k Jan 7, 2023
Infinispan is an open source data grid platform and highly scalable NoSQL cloud data store.

The Infinispan project Infinispan is an open source (under the Apache License, v2.0) data grid platform. For more information on Infinispan, including

Infinispan 1k Dec 31, 2022
Flink Table Store is a unified streaming and batch store for building dynamic tables on Apache Flink

Flink Table Store is a unified streaming and batch store for building dynamic tables on Apache Flink

The Apache Software Foundation 366 Jan 1, 2023
Mirror of Apache Cassandra

Apache Cassandra Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key. Partitioning

The Apache Software Foundation 7.7k Jan 1, 2023
Evgeniy Khyst 54 Dec 28, 2022
Time Series Metrics Engine based on Cassandra

Hawkular Metrics, a storage engine for metric data About Hawkular Metrics is the metric data storage engine part of Hawkular community. It relies on A

Hawkular 230 Dec 9, 2022
Spring MSA api gateway & service discovery with consul & Jaeger & Cassandra

Spring-Cloud-MSA setup: Prepare a Cassandra server and create the keyspace and tables with the table.sql file. Based on Consul version 1.11.1; download the 1.11.1 version for your operating system from https://www.consul.io/downloads

INSUNG CHOI 2 Nov 22, 2022
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote servers

What is Firestorm Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark applications to store shuffle data on remote ser

Tencent 246 Nov 29, 2022
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

Apache Gobblin Apache Gobblin is a highly scalable data management solution for structured and byte-oriented data in heterogeneous data ecosystems. Ca

The Apache Software Foundation 2.1k Jan 4, 2023
An API Library that provides the functionality to access, manage and store device topologies found in JSON files using Java and Maven Framework

Topology API. About: An API library which provides the functionality to access, manage and store device topologies. Description: Read a topology fr

Abdelrahman Hamdy 2 Aug 4, 2022
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.

About CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time. CrateDB offers the

Crate.io 3.6k Jan 2, 2023
A distributed in-memory data store for the cloud

EVCache EVCache is a memcached & spymemcached based caching solution that is mainly used for AWS EC2 infrastructure for caching frequently used data.

Netflix, Inc. 1.9k Jan 2, 2023
New-fangled Timeseries Data Store

Newts Newts is a time-series data store based on Apache Cassandra. Features High throughput Newts is built upon Apache Cassandra, a write-optimized, f

OpenNMS 190 Oct 3, 2022
Cosmic Ink is a transcript application which was built with the help of Symbl AI and At Sign platform for back-end to store our data and authenticate

Cosmic-Ink Cosmic Ink is a transcript application which was built with the help of Symbl AI and At Sign platform for back-end to store our data and au

Venu Sai Madisetti 4 Dec 1, 2022