devstack-plugin-ceph/94b582261cc795407188a0479439f36715210691
Gerrit User 32594 68306ddff5 Update patch set 44
Patch Set 44:

(9 comments)

Patch-set: 44
Attention: {"person_ident":"Gerrit User 32594 \u003c32594@4a232e18-c5a9-48ee-94c0-e04e7cca6543\u003e","operation":"REMOVE","reason":"\u003cGERRIT_ACCOUNT_32594\u003e replied on the change"}
2023-07-24 19:07:16 +00:00


{
"comments": [
{
"unresolved": true,
"key": {
"uuid": "b976f754_1c80ca65",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T14:19:03Z",
"side": 1,
"message": "Why are you doing this? I think it\u0027d be better to leave it the way it is - have `CEPHADM` be the actual path to the tool and call it with `sudo` when appropriate (even if it\u0027s always).",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "e7bdde55_9d73f8a1",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 32594
},
"writtenOn": "2023-05-23T17:02:22Z",
"side": 1,
"message": "In the case that remote_ceph is true, the command to get to the cephadm shell is \"ssh $SSH_USER@$CEPH_IP sudo ${TARGET_BIN}/cephadm\", and that is what I\u0027m using as $CEPHADM. I need sudo to be part of the command, otherwise I can\u0027t reuse CEPHADM as the situation calls for. The good thing is that cephadm is /always/ called with sudo, so it\u0027s an easy thing to do.",
"parentUuid": "b976f754_1c80ca65",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "24cf2225_bc832786",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T17:32:30Z",
"side": 1,
"message": "I think you should explicitly put the sudo in the ssh command. However, I still think sshing between the nodes like this is less ideal than letting ansible copy the things that we need (like has been the case for this job for a while).\n\nWhy do you need to ssh to the main node to run `cephadm` commands? Before we just needed ceph config and keys (right?)... what else needs to get run?\n\nPersonally I think that wrapping cephadm in an ssh-to-main-node is likely to be confusing.",
"parentUuid": "e7bdde55_9d73f8a1",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "1bdb1c66_24511584",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4146
},
"writtenOn": "2023-05-23T17:45:56Z",
"side": 1,
"message": "When devstack-gate and devstack grew multinode support, one of the first things we did was give up trying to manually ssh back and forth between nodes and instead rely on ansible for that communication. The major upside of this is that you can pretty easily express things like ordering and common tasks without repeating yourself and needing variable overrides like this.\n\nI would recommend you look into using ansible for this instead.\n\nWhere it might get a bit weird is that I think ansible for the most part is driving the devstack shell scripts. What this means is you don\u0027t really write ansible in here; instead you would need to write the plugin in such a way that when ansible triggers devstack shell on the controller and compute and storage nodes (I don\u0027t know the actual layout these days), these scripts do the correct thing. Basically, remove explicit ssh from here and instead operate under the ansible orchestration that is already occurring.",
"parentUuid": "24cf2225_bc832786",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "6bc4cf2a_93dbc1b2",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 16643
},
"writtenOn": "2023-05-23T20:03:52Z",
"side": 1,
"message": "That\u0027s a good point.\n\nWe were trying to design this with devstack\u0027s install phases, i.e., \"install\" will cause the creation of the ceph cluster and \"post-config\" will cause creation of the ceph resources and configuration required by devstack\u0027s services.\n\nAnsiblizing this would be cleaner; but if we stick to this approach, it would require us to enhance devstack so that \"stack.sh\" can be invoked to run those install phases on demand:\n\n\n - ./stack.sh -p override_defaults\n - ./stack.sh -p source\n - ./stack.sh -p pre-install\n - ./stack.sh -p install\n - wait for ceph to be ready\n - ./stack.sh -p post-config\n - ./stack.sh -p extra\n - ./stack.sh -p test-config\n\n\nThis would be a good improvement for CI, but it would also limit our ability to do this locally without similar smarts.",
"parentUuid": "1bdb1c66_24511584",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "bd2d5950_f42148b1",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T20:45:05Z",
"side": 1,
"message": "To be fair, I think this is *currently* ansible-ized (at least in upstream CI) so hacking in the ssh stuff is a bit of a regression.\n\nI think that in general just the ceph config and keys are required to be on the subnode, so ssh\u0027ing between them to sudo-run things is a somewhat large jump. Could we not pre-generate that for both nodes in a way that makes it fully parallelizable or reproducible in a smaller environment for people?\n\nEither way, I\u0027m very much in favor of moving to more ansible for this stuff, not less.",
"parentUuid": "6bc4cf2a_93dbc1b2",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "1145e893_12be3511",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 16643
},
"writtenOn": "2023-05-23T21:36:07Z",
"side": 1,
"message": "Hmmm, I see. As an alternative (because I\u0027m still indexing on ansiblizing this by breaking up and running the stack.sh script in stages), would it be okay to just run cephadm from the host locally? We were avoiding installing the binary on all the nodes (Ashley, keep me honest - this should work).",
"parentUuid": "bd2d5950_f42148b1",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "5f5dddaa_89b8a89c",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T21:49:42Z",
"side": 1,
"message": "Yeah I guess I\u0027m confused about that part too. Shouldn\u0027t the subnode just need the conf and keys? Why does it need any ceph stuff installed at all? Could be totally my ignorance, but I thought we only need that on the main node where the ceph server side runs and that copying the `/etc/ceph` stuff in the ansible role (which is already happening) should be basically still applicable? Why does deploying ceph with cephadm on the main node change what we need on the subnode for nova, manila, etc?",
"parentUuid": "1145e893_12be3511",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "8218d0c1_8ae8e1a4",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 32594
},
"writtenOn": "2023-05-24T13:24:09Z",
"side": 1,
"message": "So here, in the case where we have two nodes, a controller for Ceph and a subnode for Manila etc., the pre-install and install phases will run using cephadm on the controller. The install ceph config stuff is in the local.conf on that node. In the subnode, during devstack install, only post-config should run, and the ceph config options that set up those services are in that local.conf. Post-config will enable services, add pools, and create keys, and after I scp /etc/ceph to the subnode (I\u0027m not sure where this happened previously, as Dan mentioned), it configures clients.\n\nI\u0027ve just realized that I\u0027m still running pre-install in the subnode; I\u0027ll have to change that, I think. I haven\u0027t been able to run this whole thing locally because I don\u0027t have the correct fsid/cluster plumbed into post-config running on my subnode, so it stops in set_min_client_version for cinder. To fix this, I just need a way to get the cluster id from the controller (the first line from start_ceph will suffice). I\u0027ll admit I\u0027ve gotten a little confused about client_config; I see there\u0027s a ceph.conf being made there, which would override what I got from the scp I did previously. Another option is to have the FSID be part of the local.conf, similar to what I tried with SSH_USER (this helps for local deployments mostly). I think much of the confusion here might be because I tried to make this in a way that would work both locally and through CI.\n\nIf we only want to run ceph on one node, then all the ceph config options would be in the controller node (including the service configs like manila), BUT I still need to get the client keys and such to the manila node so it can communicate with the cluster. Does the ansible layer copy the /etc/ceph contents or is that somewhere else? Would only copying that file suffice?",
"parentUuid": "5f5dddaa_89b8a89c",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "679a11a7_6c73b143",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 4393
},
"writtenOn": "2023-05-24T13:29:58Z",
"side": 1,
"message": "\u003e Does the ansible layer copy the /etc/ceph contents or is that somewhere else? Would only copying that file suffice?\n\nYeah, as I linked below, it copies `/etc/ceph` to the subnode:\n\nhttps://opendev.org/openstack/devstack/src/branch/master/roles/sync-controller-ceph-conf-and-keys/tasks/main.yaml\n\nThat should have the ceph config and keys. Here\u0027s an example from an existing multinode ceph run of what gets copied to the subnode:\n\nhttps://6a4a4acdbde8f30a98f4-d9273944e714c02206b4a053d7e2acce.ssl.cf5.rackcdn.com/879682/6/check/nova-live-migration-ceph/73342c7/compute1/logs/ceph/\n\nSince that will expire, let me confirm for posterity that the directory listing includes `ceph.conf` and `ceph.client.*.keyring` files.",
"parentUuid": "8218d0c1_8ae8e1a4",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": false,
"key": {
"uuid": "9bbaac91_bb574302",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 136,
"author": {
"id": 32594
},
"writtenOn": "2023-07-24T19:07:16Z",
"side": 1,
"message": "Done",
"parentUuid": "679a11a7_6c73b143",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "ab39a40c_0d25f0bf",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 510,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T14:19:03Z",
"side": 1,
"message": "Running cat as sudo will not allow it to write to `$CEPH_CONF_DIR` because the file is actually opened by the shell. You want something like:\n```\ncat \u003c\u003cEOF | sudo tee $CEPH_CONF_DIR/$CEPH_CLUSTER_IS_READY.txt\n...\n```\nHowever since you only seem to care that it\u0027s created, you might as well just:\n```\nsudo touch $CEPH_CONF_DIR/$CEPH_CLUSTER_IS_READY.txt\n```",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "dc2c6e72_ce133342",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 510,
"author": {
"id": 32594
},
"writtenOn": "2023-05-23T17:02:22Z",
"side": 1,
"message": "Ah yes, this makes sense, thanks. I haven\u0027t worked with bash scripting before this, so I\u0027m learning as I go.",
"parentUuid": "ab39a40c_0d25f0bf",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": false,
"key": {
"uuid": "1ae96d86_68983334",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 510,
"author": {
"id": 32594
},
"writtenOn": "2023-05-24T13:24:09Z",
"side": 1,
"message": "Done",
"parentUuid": "dc2c6e72_ce133342",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "72f57bfb_0c379175",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 515,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T14:19:03Z",
"side": 1,
"message": "Where is this called from?\n\nI assume it\u0027s going to be from the subnode to the master one? I\u0027m not sure this polling over ssh is really the best way to do it, but I\u0027d have to ask.",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "897ee047_856587a5",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 515,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T14:46:02Z",
"side": 1,
"message": "Yeah, so I think we already have a way to make sure the ceph.conf gets copied from the master node to the subnode:\n\nhttps://opendev.org/openstack/devstack/src/branch/master/roles/sync-controller-ceph-conf-and-keys/tasks/main.yaml\n\nAFAIK, that\u0027s how it happens for the current package-based case. Doing this manually in this code is likely to be less reliable I think.",
"parentUuid": "72f57bfb_0c379175",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "4d83021d_800b9904",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 515,
"author": {
"id": 32594
},
"writtenOn": "2023-05-23T17:02:22Z",
"side": 1,
"message": "So the thought process here is that when the master and subnode are created in parallel, like in the case of Zuul CI, I need to make sure the node with manila (the subnode) waits until ceph is ready on the master. This is more to have a waiter than to copy ceph.conf, though that does happen later in post-config.",
"parentUuid": "897ee047_856587a5",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": true,
"key": {
"uuid": "2e17fb88_513b31db",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 515,
"author": {
"id": 4393
},
"writtenOn": "2023-05-23T17:32:30Z",
"side": 1,
"message": "Yeah I understand that\u0027s what this code inside the function is doing. What I was saying is I don\u0027t see that it actually gets called anywhere. However, I see the call is in `plugin.sh`.",
"parentUuid": "4d83021d_800b9904",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
},
{
"unresolved": false,
"key": {
"uuid": "0066e3c6_253b3f2d",
"filename": "devstack/lib/cephadm",
"patchSetId": 25
},
"lineNbr": 515,
"author": {
"id": 32594
},
"writtenOn": "2023-07-24T19:07:16Z",
"side": 1,
"message": "Done",
"parentUuid": "2e17fb88_513b31db",
"revId": "94b582261cc795407188a0479439f36715210691",
"serverId": "4a232e18-c5a9-48ee-94c0-e04e7cca6543"
}
]
}